Method for the complete chemical synthesis and assembly of genes and genomes

ABSTRACT

The present invention relates generally to the field of oligonucleotide synthesis. More particularly, it concerns the assembly of genes and genomes of completely synthetic artificial organisms. Thus, the present invention outlines a novel approach to utilizing the results of genomic sequence information by computer-directed gene synthesis based on computing on the human genome database. Specifically, the present invention contemplates and describes the chemical synthesis and resynthesis of genes defined by the genome sequence in a host vector, and the transfer of these sequences into suitable hosts for expression.

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates generally to the field of oligonucleotide synthesis. More particularly, it concerns the assembly of genes and genomes of completely synthetic artificial organisms.

[0003] 2. Description of Related Art

[0004] Present research and commercial applications in molecular biology are based upon recombinant DNA developed in the 1970's. A critical facet of recombinant DNA is molecular cloning in plasmids, covered under the seminal patent of Cohen and Boyer (U.S. Pat. No. 4,740,470 “Biologically functional molecular chimeras”). This patent teaches a method for the “cutting and splicing” of DNA molecules based upon restriction endonucleases, the introduction of these “recombinant” molecules into host cells, and their replication in the bacterial hosts. This technique is the basis of all molecular cloning for research and commercial purposes carried out for the past 20 years and the basis of the field of molecular biology and genetics.

[0005] Recombinant DNA technology is a powerful technology, but is limited in utility to modifications of existing DNA sequences, which are modified through 1) restriction enzyme cleavage sites, 2) PCR primers for amplification, 3) site-specific mutagenesis, and other techniques. The creation of an entirely new molecule, or the substantial modification of existing molecules, is extremely time consuming, expensive, requires complex and multiple steps, and in some cases is impossible. Recombinant DNA technology does not permit the creation of entirely artificial molecules, genes, genomes or organisms, but only modifications of naturally-occurring organisms.

[0006] Current biotechnology for industrial production, for drug design and development, for potential applications of vaccine development and genetic therapy, and for agricultural and environmental use of recombinant DNA, depends on naturally-occurring organisms and DNA molecules. To create or engineer new or novel functions, or to modify organisms for specialized use (such as producing a human hormone), requires substantially complex, time consuming and difficult manipulations of naturally-occurring DNA molecules. In some cases, changes to naturally-occurring DNA are so complex that they are not possible in practice. Thus, there is a need for technology that allows the creation of novel DNA molecules in a single step without requiring the use of any existing recombinant or naturally-occurring DNA.

SUMMARY OF THE INVENTION

[0007] The present invention addresses the limitations in present recombinant nucleic acid manipulations by providing a fast, efficient means for generating practically any nucleic acid sequence, including entire genes, chromosomal segments, chromosomes and genomes. Because this approach is completely synthetic, there are no limitations, such as the availability of existing nucleic acids, to hinder the construction of even very large segments of nucleic acid.

[0008] Thus, in a first embodiment, there is provided a method for the construction of a double-stranded DNA segment comprising the steps of (i) providing two sets of single-stranded oligonucleotides, wherein (a) the first set comprises the entire plus strand of said DNA segment, (b) the second set comprises the entire minus strand of said DNA segment, and (c) each of said first set of oligonucleotides being complementary to two oligonucleotides of said second set of oligonucleotides, (ii) annealing said first and said second set of oligonucleotides, and (iii) treating said annealed oligonucleotides with a ligating enzyme. Optional steps provide for the synthesis of the oligonucleotide sets and the transformation of host cells with the resulting DNA segment.

[0009] In particular embodiments, the DNA segment is 100, 200, 300, 400, 800, 1000, 1500, 2000, 4000, 8000, 10,000, 12,000, 18,000, 20,000, 40,000, 80,000, 100,000, 10⁶, 10⁷, 10⁸, 10⁹ or more base pairs in length. Indeed, it is contemplated that the methods of the present invention will be able to create entire artificial genomes of lengths comparable to known bacterial, yeast, viral, mammalian, amphibian, reptilian and avian genomes. In more particular embodiments, the DNA segment is a gene encoding a protein of interest. The DNA segment further may include non-coding elements such as origins of replication, telomeres, promoters, enhancers, transcription and translation start and stop signals, introns, exon splice sites, chromatin scaffold components and other regulatory sequences. The DNA segment may comprise multiple genes, chromosomal segments, chromosomes and even entire genomes. The DNA segments may be derived from prokaryotic or eukaryotic sequences including bacterial, yeast, viral, mammalian, amphibian, reptilian, avian, plant, archaebacterial and other DNA-containing living organisms.

[0010] The oligonucleotide sets preferably comprise oligonucleotides of between about 15 and 100 bases and more preferably between about 20 and 50 bases. Specific lengths include, but are not limited to, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 and 100. Depending on the size, the overlap between the oligonucleotides of the two sets may be designed to be between 5 and 75 bases per oligonucleotide pair.

[0011] The oligonucleotides preferably are treated with polynucleotide kinase, for example, T4 polynucleotide kinase. The kinasing can be performed prior to mixing of the oligonucleotide sets or after, but before annealing. After annealing, the oligonucleotides are treated with an enzyme having a ligating function. For example, a DNA ligase typically will be employed for this function. However, topoisomerase, which does not require 5′ phosphorylation, is rapid and operates at room temperature, and may be used instead of ligase.

[0012] In a second embodiment, there is provided a method for construction of a double-stranded DNA segment comprising the steps of (i) providing two sets of single-stranded oligonucleotides, wherein (a) the first set comprises the entire plus strand of said DNA segment, (b) the second set comprises the entire minus strand of said DNA segment, and (c) each of said first set of oligonucleotides being complementary to two oligonucleotides of said second set of oligonucleotides, (ii) annealing pairs of complementary oligonucleotides to produce a set of first annealed products, wherein each pair comprises an oligonucleotide from each of said first and said second sets of oligonucleotides, (iii) annealing pairs of first annealed products having complementary sequences to produce a set of second annealed products, (iv) repeating the process until all annealed products have been annealed into a single DNA segment, and (v) treating said annealed products with ligating enzyme.
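The round-by-round pairing of steps (ii) through (iv) can be sketched as follows. This is only an illustrative model, not the inventor's software: the function name `pairwise_rounds` and the string labels are hypothetical, and adjacent entries in the input list are assumed to carry complementary overhangs.

```python
def pairwise_rounds(fragments):
    """Return the annealed products formed in each round of pairwise annealing.

    `fragments` is an ordered list of labels for the first annealed products;
    neighbours in the list are assumed (hypothetically) to be able to anneal.
    """
    rounds = []
    current = list(fragments)
    while len(current) > 1:
        merged = []
        for i in range(0, len(current), 2):
            # Join two neighbours; an unpaired last product carries over as-is.
            merged.append("+".join(current[i:i + 2]))
        rounds.append(merged)
        current = merged
    return rounds


if __name__ == "__main__":
    products = pairwise_rounds(["d1", "d2", "d3", "d4", "d5"])
    for number, round_products in enumerate(products, start=1):
        print(f"round {number}: {round_products}")
```

Under this model the number of annealing rounds grows roughly with the base-2 logarithm of the number of starting products, which is one plausible reason to prefer pairwise assembly for long segments.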

[0013] In a third embodiment, there is provided a method for the construction of a double-stranded DNA segment comprising the steps of (i) providing two sets of single-stranded oligonucleotides, wherein (a) the first set comprises the entire plus strand of said DNA segment, (b) the second set comprises the entire minus strand of said DNA segment, and (c) each of said first set of oligonucleotides being complementary to two oligonucleotides of said second set of oligonucleotides, (ii) annealing the 5′ terminal oligonucleotide of said first set of oligonucleotides with the 3′ terminal oligonucleotide of said second set of oligonucleotides, (iii) annealing the next most 5′ terminal oligonucleotide of said first set of oligonucleotides with the product of step (ii), (iv) annealing the next most 3′ terminal oligonucleotide of said second set of oligonucleotides with the product of step (iii), (v) repeating the process until all oligonucleotides of said first and said second sets have been annealed, and (vi) treating said annealed oligonucleotides with ligating enzyme. Optional steps provide for the synthesis of the oligonucleotide sets and the transformation of host cells with the resulting DNA segment. In a preferred embodiment, the 5′ terminal oligonucleotide of the first set is attached to a support, which process may include the additional step of removing the DNA segment from the support. The support may be any support known in the art, for example, a microtiter plate, a filter, polystyrene beads, polystyrene tray, magnetic beads, agarose and the like.
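The order of addition in steps (ii) through (v) alternates between the two sets. A minimal sketch of that ordering, assuming the plus-strand oligonucleotides are listed from the 5′ end and the minus-strand oligonucleotides from the 3′ end (the function name `addition_order` and the labels are hypothetical):

```python
from itertools import zip_longest


def addition_order(plus_from_5p_end, minus_from_3p_end):
    """Interleave the two sets in the order they would be annealed.

    plus_from_5p_end:  plus-strand oligonucleotides, 5' terminal first.
    minus_from_3p_end: minus-strand oligonucleotides, 3' terminal first.
    """
    order = []
    for plus_oligo, minus_oligo in zip_longest(plus_from_5p_end, minus_from_3p_end):
        # One set may hold one more oligonucleotide than the other.
        if plus_oligo is not None:
            order.append(plus_oligo)
        if minus_oligo is not None:
            order.append(minus_oligo)
    return order


if __name__ == "__main__":
    print(addition_order(["P1", "P2", "P3"], ["M1", "M2", "M3"]))
```

The first entry of the returned list corresponds to the oligonucleotide that would be attached to the support in the preferred embodiment.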

[0014] Annealing conditions may be adjusted based on the particular strategy used for annealing, the size and composition of the oligonucleotides, and the extent of overlap between the oligonucleotides of the first and second sets. For example, where all the oligonucleotides are mixed together prior to annealing, heating the mixture to 80° C., followed by slow annealing for between 1 and 12 h is conducted. Thus, annealing may be conducted for about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, or about 10 h. However, in other embodiments, the annealing time may be as long as 24 h.

[0015] With the aid of a computer, the inventor is able to direct synthesis of a vector/gene combination using a high throughput oligonucleotide synthesizer as a set of overlapping component oligonucleotides. The oligonucleotides are assembled using a robotic combinatoric assembly strategy and the assembly ligated using DNA ligase or topoisomerase, followed by transformation into a suitable host strain. In a particular embodiment, this invention generates a set of bacterial strains containing a viable expression vector for all genes in a defined region of the genome. In other embodiments, a yeast or baculovirus expression vector system is also contemplated to allow expression of each gene in a chromosomal region in a eukaryotic host. In yet another embodiment, the present invention allows one of skill in the art to devise a “designer gene” strategy wherein a gene, a genome or virtually any structure may be readily designed, synthesized and expressed. Thus, eventually the technology described herein may be employed to create entire genomes for introduction into host cells for the creation of entirely artificial designer living organisms.

[0016] In specific embodiments, the present invention provides a method for the synthesis of a replication-competent, double-stranded polynucleotide, wherein the polynucleotide comprises an origin of replication, a first coding region and a first regulatory element directing the expression of the first coding region.

[0017] Additionally, the method may further comprise the step of amplifying the double-stranded polynucleotide. In specific embodiments, the double-stranded polynucleotide comprises 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 5000, 10×10³, 20×10³, 30×10³, 40×10³, 50×10³, 60×10³, 70×10³, 80×10³, 90×10³, 1×10⁴, 1×10⁵, 1×10⁶, 1×10⁷, 1×10⁸, 1×10⁹ or 1×10¹⁰ base pairs in length. The first regulatory element may be a promoter. In certain embodiments, the double-stranded polynucleotide further comprises a second regulatory element, the second regulatory element being a polyadenylation signal. In yet further embodiments, the double-stranded polynucleotide comprises a plurality of coding regions and a plurality of regulatory elements. Specifically, it is contemplated that the coding regions encode products that comprise a biochemical pathway. In particular embodiments the biochemical pathway is glycolysis. More particularly, it is contemplated that the coding regions encode enzymes selected from the group consisting of hexokinase, phosphohexose isomerase, phosphofructokinase-1, aldolase, triose-phosphate isomerase, glyceraldehyde-3-phosphate dehydrogenase, phosphoglycerate kinase, phosphoglycerate mutase, enolase and pyruvate kinase enzymes of the glycolytic pathway.

[0018] In other embodiments, the biochemical pathway is lipid synthesis or cofactor synthesis. Particularly contemplated are lipoic acid synthesis, riboflavin synthesis and nucleotide synthesis; the nucleotide may be a purine or a pyrimidine.

[0019] In certain other embodiments it is contemplated that the coding regions encode enzymes involved in a cellular process selected from the group consisting of cell division, chaperone, detoxification, peptide secretion, energy metabolism, regulatory function, DNA replication, transcription, RNA processing and tRNA modification. In preferred embodiments, the energy metabolism is oxidative phosphorylation.

[0020] It is contemplated that the double-stranded polynucleotide is a DNA or an RNA. In preferred embodiments, the double-stranded polynucleotide may be a chromosome. The double-stranded polynucleotide may be an expression construct. Specifically, the expression construct may be a bacterial expression construct, a mammalian expression construct or a viral expression construct. In particular embodiments, the double-stranded polynucleotide comprises a genome selected from the group consisting of bacterial genome, yeast genome, viral genome, mammalian genome, amphibian genome and avian genome.

[0021] In those embodiments in which the genome is a viral genome, the viral genome may be selected from the group consisting of retrovirus, adenovirus, vaccinia virus, herpesvirus and adeno-associated virus.

[0022] The present invention further provides a method of producing a viral particle.

[0023] Another embodiment provides a method of producing an artificial genome, wherein the chromosome comprises all coding regions and regulatory elements found in a corresponding natural chromosome. In specific embodiments, the corresponding natural chromosome is a human mitochondrial genome. In other embodiments, the corresponding natural chromosome is a chloroplast genome.

[0024] Also provided is a method of producing an artificial genetic system, wherein the system comprises all coding regions and regulatory elements found in a corresponding natural biochemical pathway. Such a biochemical pathway will likely possess a group of enzymes that serially metabolize a compound. In particularly preferred embodiments, the biochemical pathway comprises the activities required for glycolysis. In other embodiments, the biochemical pathway comprises the enzymes required for electron transport. In still further embodiments, the biochemical pathway comprises the enzyme activities required for photosynthesis.

[0025] Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

[0026] The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.

[0027] FIG. 1. Flow diagram of the Jurassic Park paradigm for the reassembly of living organisms.

[0028] FIG. 2. Flow diagram of the strategy of synthetic genetics.

[0029] FIG. 3. Flow diagram of the strategy for combinatoric assembly of oligonucleotides into complete genes or genomes.

[0030] FIG. 4. Design of plasmid synlux4. The sequence of 4800 is annotated with the locations of lux A+B genes, neomycin/kanamycin phosphotransferase and pUC 19 sequences.

[0031] FIG. 5. List of component oligonucleotides derived from the sequence of Synlux4 in FIG. 4.

[0032] FIG. 6. Schema for the combinatoric assembly of synthetic plasmids from component oligonucleotides.

[0033] FIG. 7. SynGene program for generating overlapping oligonucleotides sufficient to reassemble the gene or plasmid.

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

[0034] The complete sequence of complex genomes, including the human genome, makes large scale functional approaches to genetics possible. The present invention outlines a novel approach to utilizing the results of genomic sequence information by computer-directed gene synthesis based on computing on the human genome database. Specifically, the invention describes chemical synthesis and resynthesis of genes for transfer of these genes into suitable host cells.

[0035] The present invention provides methods that can be used to synthesize, de novo, DNA segments that encode sets of genes, either naturally occurring genes expressed from natural or artificial promoter constructs or artificial genes derived from synthetic DNA sequences, which encode elements of biological systems that perform a specified function or attribute of an artificial organism, as well as entire genomes. In producing such systems and genomes, the present invention provides the synthesis of a replication-competent, double-stranded polynucleotide, wherein the polynucleotide has an origin of replication, a first coding region and a first regulatory element directing the expression of the first coding region. By replication competent, it is meant that the polynucleotide is capable of directing its own replication. Thus, it is envisioned that the polynucleotide will possess all the cis-acting signals required to facilitate its own synthesis. In this respect, the polynucleotide will be similar to a plasmid or a virus, such that once placed within a cell, it is able to be replicated by a combination of the polynucleotide's and cellular functions.

[0036] Thus, using the techniques of the present invention, one of skill in the art can create an artificial genome that is capable of encoding all the activities required for sustaining its own existence. Also contemplated are artificial genetic systems that are capable of encoding enzymes and activities of a particular biochemical pathway. In such a system, it will be desirable to have all the activities present such that the whole biochemical pathway will operate. The co-expression of a set of enzymes required for a particular pathway constitutes a complete genetic or biological system. For example, the co-expression of the enzymes involved in glycolysis constitutes a complete genetic system for the production of energy in the form of ATP from glucose. Such systems for energy production may include groups of enzymes which naturally or artificially serially metabolize a set of compounds.

[0037] The types of biochemical pathways would include but are not limited to those for the biosynthesis of cofactors, prosthetic groups and carriers (lipoate synthesis, riboflavin synthesis, pyridine nucleotide synthesis); the biosynthesis of the cell envelopes (membranes, lipoproteins, porins, surface polysaccharides, lipopolysaccharides, antigens and surface structures); cellular processes including cell division, chaperones, detoxification, protein secretion, central intermediary metabolism (energy production via phosphorus compounds and other); energy metabolism including aerobic, anaerobic, ATP proton motive force interconversions, electron transport, glycolysis, triose phosphate pathway, pyruvate dehydrogenase, sugar metabolism; purine, pyrimidine nucleotide synthesis, including 2′-deoxyribonucleotide synthesis, nucleotide and nucleoside interconversion, salvage of nucleoside and nucleotides, sugar-nucleotide biosynthesis and conversion; regulatory functions including transcriptional and translational controls; DNA replication including degradation of DNA, DNA replication, restriction modification, recombination and repair; transcription including degradation of DNA, DNA-dependent RNA polymerase and transcription factors; RNA processing; translation including amino acyl tRNA synthetases, degradation of peptides and glycopeptides, protein modification, ribosome synthesis and modification, tRNA modification; translation factors; transport and binding proteins including amino acid, peptide, amine, carbohydrate, organic alcohol, organic acid and cation transport; and other systems for the adaptation, specific function or survival of an artificial organism.

[0038] A. Definitions

[0039] DNA segment—a linear piece of DNA having a double-stranded region and both 5′- and 3′-ends; the segment may be of any length sufficiently long to be created by the hybridization of at least two oligonucleotides having complementary regions.

[0040] Oligonucleotides—small DNA segments, single-stranded or double-stranded, comprised of the nucleotide bases A, T, G and C linked through phosphate bonds; oligonucleotides typically range from about 10 to 100 base pairs.

[0041] Plus strand—by convention, the single-strand of a double-stranded DNA that starts with the 5′ end to the left as one reads the sequence.

[0042] Minus strand—by convention, the single-strand of a double-stranded DNA that starts with the 3′ end to the left as one reads the sequence.

[0043] Complementary—where two nucleic acids have at least a portion of their sequences, when read in opposite (5′→3′; 3′→5′) direction, that pair sequential nucleotides in the following fashion: A-T, G-C, T-A, C-G.
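The pairing rule in this definition can be illustrated with a short sketch. The function names `reverse_complement` and `are_complementary` are hypothetical helpers, not part of the specification, and sequences are assumed to contain only the bases A, T, G and C.

```python
# Watson-Crick pairing: A-T, T-A, G-C, C-G.
_PAIRING = str.maketrans("ATGC", "TACG")


def reverse_complement(seq):
    """Return the strand that pairs with `seq`, written 5' to 3'."""
    return seq.upper().translate(_PAIRING)[::-1]


def are_complementary(strand_a, strand_b):
    """True when two 5'->3' sequences pair along their whole length."""
    return reverse_complement(strand_a) == strand_b.upper()


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(are_complementary("AAGCT", "AGCTT"))
```

Both strands are written 5′→3′ here, so the second strand is reversed before pairing, which mirrors the "read in opposite direction" wording of the definition.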

[0044] Oligonucleotide sets—a plurality of oligonucleotides that, taken together, comprise the sequence of a plus or minus strand of a DNA segment.

[0045] Annealed products—two or more oligonucleotides having complementary regions, where they are permitted, under proper conditions, to base pair, thereby producing double stranded regions.

[0046] B. The Present Invention

[0047] The present invention describes methods for enabling the creation of DNA molecules, genomes and entire artificial living organisms based upon information only, without the requirement for existing genes, DNA molecules or genomes.

[0048] The methods of the present invention are diagrammed in FIG. 1 and FIG. 2 and generally involve the following steps. Using simple computer software comprising sets of gene parts and functional elements, it is possible to construct a virtual polynucleotide in the computer. This polynucleotide consists of a string of DNA bases, G, A, T or C, comprising, for example, an entire artificial genome in a linear string. For transfer of the synthetic gene into, for example, bacterial cells, the polynucleotide should contain the sequence for a bacterial (such as pBR322) origin of replication. For transfer into eukaryotic cells, it should contain the origin of replication of a mammalian virus, chromosome or subcellular component such as mitochondria.

[0049] Following construction, simple computer software is then used to break down the genome sequence into a set of overlapping oligonucleotides of specified length. This results in a set of shorter DNA sequences which overlap to cover the entire genome in overlapping sets. Typically, a gene of 1000 base pairs would be broken down into 20 100-mers, where 10 of these comprise one strand and 10 of these comprise the other strand. They would be selected to overlap on each strand by 25 to 50 base pairs.
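The breakdown of a sequence into two staggered oligonucleotide sets can be sketched as follows. This is a minimal illustration, not the SynGene program of FIG. 7: the function names, the default 50-base stagger and the handling of circular versus linear targets are all assumptions made for the example.

```python
_PAIRING = str.maketrans("ATGC", "TACG")


def reverse_complement(seq):
    """Return the pairing strand of `seq`, written 5' to 3'."""
    return seq.upper().translate(_PAIRING)[::-1]


def design_oligos(seq, oligo_len=100, stagger=50, circular=True):
    """Split `seq` into plus- and minus-strand oligonucleotide sets.

    Plus oligos tile the sequence end to end. Minus oligos cover the same
    sequence shifted by `stagger` bases, so every plus oligo overlaps two
    minus oligos. Assumes 0 < stagger < oligo_len.
    """
    seq = seq.upper()
    n = len(seq)
    plus = [seq[i:i + oligo_len] for i in range(0, n, oligo_len)]
    if circular:
        # Rotate the sequence so the last minus oligo wraps around the origin.
        shifted = seq[stagger:] + seq[:stagger]
        templates = [shifted[i:i + oligo_len] for i in range(0, n, oligo_len)]
    else:
        # Linear target: shorter minus oligos are needed at both ends.
        bounds = [0] + list(range(stagger, n, oligo_len)) + [n]
        templates = [seq[a:b] for a, b in zip(bounds, bounds[1:])]
    minus = [reverse_complement(t) for t in templates]
    return plus, minus


if __name__ == "__main__":
    demo = "ACGTTGCAAG" * 100  # placeholder 1000-base sequence, not a real gene
    plus_set, minus_set = design_oligos(demo)
    print(len(plus_set), len(minus_set))
```

For a circular 1000-base target with the defaults, this yields 10 plus-strand and 10 minus-strand 100-mers, matching the example in the text; a linear target instead gives 11 minus-strand oligonucleotides because the two ends need shorter pieces.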

[0050] This step is followed by direction of chemical synthesis of each of the overlapping set of oligonucleotides using an array type synthesizer and phosphoramidite chemistry, resulting in an array of synthesized oligomers. The next step is to balance the concentration of each oligomer and pool the oligomers so that a single mixture contains equal concentrations of each. The mixed oligonucleotides are treated with T4 polynucleotide kinase to 5′ phosphorylate the oligonucleotides. The next step is to carry out a “slow” annealing step to co-anneal all of the oligomers into the sequence of the predicted gene or genome. This is done by heating the mixture to 80° C., then allowing it to cool slowly to room temperature over several hours. The mixture of oligonucleotides is then treated with T4 DNA ligase (or alternatively topoisomerase) to join the oligonucleotides. The joined oligonucleotides are then transferred into competent host cells.
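The concentration balancing step amounts to simple arithmetic. A minimal sketch, assuming each stock concentration has already been measured in µM (the function name `pooling_volumes` and the example values are hypothetical):

```python
def pooling_volumes(stock_conc_um, pmol_each):
    """Volume (in µL) of each stock needed to add `pmol_each` pmol to the pool.

    Uses the identity 1 µM = 1 pmol/µL, so volume = amount / concentration.
    """
    volumes = {}
    for name, conc in stock_conc_um.items():
        if conc <= 0:
            raise ValueError(f"{name}: concentration must be positive")
        volumes[name] = pmol_each / conc
    return volumes


if __name__ == "__main__":
    stocks = {"oligo_1": 100.0, "oligo_2": 50.0, "oligo_3": 80.0}  # µM, made up
    for name, volume in pooling_volumes(stocks, pmol_each=10.0).items():
        print(f"{name}: {volume:.3f} µL")
```

Pooling these volumes gives a mixture in which every oligomer is present in the same molar amount, and therefore at the same concentration.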

[0051] The above technique represents a “combinatorial” assembly strategy where all oligonucleotides are jointly co-annealed by temperature-based slow annealing. A variation on this strategy, which may be more suitable for very long genes or genomes, such as greater than 5,000 base pairs final length, is as follows. Using simple computer software, comprising sets of gene parts and functional elements, a virtual gene or genome is constructed in the computer. This gene or genome would consist of a string of DNA bases, G, A, T or C, comprising the entire genome in a linear string. For transfer of the synthetic gene into bacterial cells, it should contain the sequence for a bacterial (such as pBR322) origin of replication.

[0052] The next step is to carry out a ligation chain reaction using a new oligonucleotide addition at each step. With this procedure, the first oligonucleotide in the chain is attached to a solid support (such as an agarose bead). The second is added along with DNA ligase, an annealing and ligation reaction is carried out, and the beads are washed. The second, overlapping oligonucleotide from the opposite strand is added and annealed, and ligation is carried out. The third oligonucleotide is added and ligation carried out. This procedure is repeated until all oligonucleotides are added and ligated. This procedure is best carried out for long sequences using an automated device. The DNA sequence is removed from the solid support, a final ligation (if circular) is carried out, and the molecule is transferred into host cells.

[0053] Alternatively, it is contemplated that, if the ligation kinetics allow, all the oligonucleotides may be placed in a mixture and ligation be allowed to proceed. In yet another embodiment, a series of smaller polynucleotides may be made by ligating 2, 3, 4, 5, 6, or 7 oligonucleotides into one sequence and adding this to another sequence comprising a similar number of oligonucleotide parts.
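Planning such sub-assemblies is a matter of splitting the ordered oligonucleotide list into consecutive groups. A minimal sketch under that assumption (the function name `group_oligos` and the labels are hypothetical):

```python
def group_oligos(oligos, group_size=5):
    """Split an ordered list of oligonucleotides into consecutive groups.

    Each group would be ligated into one smaller polynucleotide before the
    groups are joined. The text contemplates groups of 2 to 7; the final
    group may be smaller when the list does not divide evenly.
    """
    if not 2 <= group_size <= 7:
        raise ValueError("group_size should be between 2 and 7")
    return [oligos[i:i + group_size] for i in range(0, len(oligos), group_size)]


if __name__ == "__main__":
    labels = [f"oligo_{i}" for i in range(1, 21)]
    for block in group_oligos(labels, group_size=5):
        print(block)
```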

[0054] The ligase chain reaction (“LCR”), disclosed in EPO No. 320 308, is incorporated herein by reference in its entirety. In LCR, two complementary probe pairs are prepared, and in the presence of the target sequence, each pair will bind to opposite complementary strands of the target such that they abut. In the presence of a ligase, the two probe pairs will link to form a single unit. By temperature cycling, as in PCR™, bound ligated units dissociate from the target and then serve as “target sequences” for ligation of excess probe pairs. U.S. Pat. No. 4,883,750 describes a method similar to LCR for binding probe pairs to a target sequence. The following sections describe these methods in further detail.

[0055] C. Nucleic Acids

[0056] The present invention discloses the artificial synthesis of genes. In one embodiment of the present invention, the artificial genes can be transferred into cells to confer a particular function either as discrete units or as part of artificial chromosomes or genomes. One will generally prefer to design oligonucleotides having stretches of 15 to 100 nucleotides, 25 to 200 nucleotides or even longer where desired. Such fragments may be readily prepared by directly synthesizing the fragment by chemical means as described below.

[0057] Accordingly, the nucleotide sequences of the invention may be used for their ability to selectively form duplex molecules with complementary stretches of genes or RNAs or to provide primers for amplification of DNA or RNA from tissues. Depending on the application envisioned, one will desire to employ varying conditions of hybridization to achieve varying degrees of hybridization selectivity. Typically high selectivity is favored.

[0058] For applications requiring high selectivity, one typically will desire to employ relatively stringent conditions to form the hybrids, e.g., one will select relatively low salt and/or high temperature conditions, such as provided by about 0.02 M to about 0.10 M NaCl at temperatures of about 50° C. to about 70° C. Such high stringency conditions tolerate little, if any, mismatch between the oligonucleotide and the template or target strand. It generally is appreciated that conditions can be rendered more stringent by the addition of increasing amounts of formamide.

[0059] For certain applications, for example, by analogy to substitution of nucleotides by site-directed mutagenesis, it is appreciated that lower stringency conditions may be used. Under these conditions, hybridization may occur even though the sequences of probe and target strand are not perfectly complementary, but are mismatched at one or more positions. Conditions may be rendered less stringent by increasing salt concentration and decreasing temperature. For example, a medium stringency condition could be provided by about 0.1 to 0.25 M NaCl at temperatures of about 37° C. to about 55° C., while a low stringency condition could be provided by about 0.15 M to about 0.9 M salt, at temperatures ranging from about 20° C. to about 55° C. Thus, hybridization conditions can be readily manipulated depending on the desired results.
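The salt and temperature ranges given for high, medium and low stringency can be collected in a small lookup. This is only a sketch for orientation: the names `STRINGENCY_RANGES` and `matching_stringency` are hypothetical, the "about" in each stated range is treated as an exact bound, and because the stated ranges overlap, a condition may match more than one category.

```python
# (salt range in M, temperature range in °C), taken from the ranges in the text.
STRINGENCY_RANGES = {
    "high": ((0.02, 0.10), (50.0, 70.0)),
    "medium": ((0.10, 0.25), (37.0, 55.0)),
    "low": ((0.15, 0.90), (20.0, 55.0)),
}


def matching_stringency(salt_molar, temp_c):
    """Return every stringency category whose ranges include the condition."""
    matches = []
    for name, ((salt_lo, salt_hi), (temp_lo, temp_hi)) in STRINGENCY_RANGES.items():
        if salt_lo <= salt_molar <= salt_hi and temp_lo <= temp_c <= temp_hi:
            matches.append(name)
    return matches


if __name__ == "__main__":
    print(matching_stringency(0.05, 65.0))
    print(matching_stringency(0.20, 40.0))
```

The lookup does not account for formamide, oligonucleotide length or G+C content, all of which the text notes can shift the effective stringency.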

[0060] In certain embodiments, it will be advantageous to determine the hybridization of oligonucleotides by employing a label. A wide variety of appropriate indicator means are known in the art, including fluorescent, radioactive, enzymatic or other ligands, such as avidin/biotin, which are capable of being detected. In preferred embodiments, one may desire to employ a fluorescent label or an enzyme tag such as urease, alkaline phosphatase or peroxidase, instead of radioactive or other environmentally undesirable reagents. In the case of enzyme tags, colorimetric indicator substrates are known that can be employed to provide a detection means, visible to the human eye or spectrophotometrically, to identify whether specific hybridization with complementary oligonucleotide has occurred.

[0061] In embodiments involving a solid phase, for example, the first oligonucleotide is adsorbed or otherwise affixed to a selected matrix or surface. This fixed, single-stranded nucleic acid is then subjected to hybridization with the complementary oligonucleotides under desired conditions. The selected conditions will also depend on the particular circumstances based on the particular criteria required (depending, for example, on the G+C content, type of target nucleic acid, source of nucleic acid, size of hybridization probe, etc.). Following washing of the hybridized surface to remove non-specifically bound oligonucleotides, the hybridization may be detected, or even quantified, by means of the label.

[0062] For applications in which the nucleic acid segments of the present invention are incorporated into vectors, such as plasmids, cosmids or viruses, these segments may be combined with other DNA sequences, such as promoters, polyadenylation signals, restriction enzyme sites, multiple cloning sites, other coding segments, and the like, such that their overall length may vary considerably. It is contemplated that a nucleic acid fragment of almost any length may be employed, with the total length preferably being limited by the ease of preparation and use in the intended recombinant DNA protocol.

[0063] DNA segments encoding a specific gene may be introduced into recombinant host cells and employed for expressing a specific structural or regulatory protein. Alternatively, through the application of genetic engineering techniques, subportions or derivatives of selected genes may be employed. Upstream regions containing regulatory regions such as promoter regions may be isolated and subsequently employed for expression of the selected gene.

[0064] The nucleic acids employed may encode antisense constructs thathybridize, under intracellular conditions, to a nucleic acid ofinterest. The term “antisense construct” is intended to refer to nucleicacids, preferably oligonucleotides, that are complementary to the basesequences of a target DNA. Antisense oligonucleotides, when introducedinto a target cell, specifically bind to their target nucleic acid andinterfere with transcription, RNA processing, transport, translationand/or stability. Antisense constructs may be designed to bind to thepromoter and other control regions, exons, introns or even exon-intronboundaries of a gene.

[0065] Other sequences with lower degrees of homology also arecontemplated. For example, an antisense construct which has limitedregions of high homology, but also contains a non-homologous region(e.g., a ribozyme) could be designed. These molecules, though havingless than 50% homology, would bind to target sequences under appropriateconditions.

[0066] In certain embodiments, one may wish to employ antisenseconstructs which include other elements, for example, those whichinclude C-5 propyne pyrimidines. Oligonucleotides which contain C-5propyne analogues of uridine and cytidine have been shown to bind RNAwith high affinity and to be potent antisense inhibitors of geneexpression (Wagner et al., 1993).

[0067] According to the present invention, DNA segments of a variety ofsizes will be produced. These DNA segments will, by definition, belinear molecules. As such, they typically will be modified beforefurther use. These modifications include, in one embodiment, therestriction of the segments to produce one or more “sticky ends”compatible with complementary ends of other molecules, including thosein vectors capable of supporting the replication of the DNA segment.This manipulation facilitates “cloning” of the segments.
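As an illustration of how restriction produces "sticky ends", the sketch below cuts the top strand of a linear segment at each EcoRI recognition site (GAATTC, cleaved after the first G, which leaves a 5′ AATT overhang). The choice of EcoRI, the function name and the example sequence are assumptions made for illustration; the specification does not prescribe a particular enzyme.

```python
ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # the top-strand cut falls after the first G: G^AATTC


def digest(seq: str, site: str = ECORI_SITE, offset: int = ECORI_CUT_OFFSET) -> list[str]:
    """Split the top strand of a linear DNA at every occurrence of `site`."""
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


segment = "AAGAATTCTTGAATTCGG"  # hypothetical segment with two EcoRI sites
print(digest(segment))
```

Each internal fragment begins with AATTC on the top strand, so its end can anneal to any other EcoRI-cut end, including that of a vector.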

[0068] Typically, cloning involves the use of restriction endonucleases, which cleave at particular sites within DNA strands, to prepare a DNA segment for transfer into a cloning vehicle. Ligation of the compatible ends (which include blunt ends) using a DNA ligase completes the reaction. Depending on the situation, the cloning vehicle may comprise a relatively small portion of DNA, compared to the insert. Alternatively, the cloning vehicle may be extremely complex and include a variety of features that will affect the replication and function of the DNA segment. In certain embodiments, a rare cutter site may be introduced into the end of the polynucleotide sequence.

[0069] Cloning vehicles include plasmids such as the pUC series, Bluescript™ vectors and a variety of other vehicles with multipurpose cloning sites, selectable markers and origins of replication. Because of the nature of the present invention, the cloning vehicles may include such complex molecules as phagemids and cosmids, which hold relatively large pieces of DNA. In addition, the generation of artificial chromosomes, and even genomes, is contemplated.

[0070] Following cloning into a suitable vector, the construct then istransferred into a compatible host cell. A variety of different genetransfer techniques are described elsewhere in this document. Culture ofthe host cells for the intended purpose (amplification, expression,subcloning) follows.

[0071] Throughout this application, the term “expression construct” is meant to include a particular kind of cloning vehicle containing a nucleic acid coding for a gene product in which part or all of the nucleic acid encoding sequence is capable of being transcribed. The transcript may be translated into a protein, but it need not be. Thus, in certain embodiments, expression includes both transcription of a gene and translation of an RNA into a gene product. In other embodiments, expression only includes transcription of the nucleic acid, for example, to generate antisense constructs.

[0072] In preferred embodiments, the nucleic acid is undertranscriptional control of a promoter. A “promoter” refers to a DNAsequence recognized by the synthetic machinery of the cell, orintroduced synthetic machinery, required to initiate the specifictranscription of a gene. The phrase “under transcriptional control”means that the promoter is in the correct location and orientation inrelation to the nucleic acid to control RNA polymerase initiation andexpression of the gene.

[0073] The term promoter will be used here to refer to a group of transcriptional control modules that are clustered around the initiation site for RNA polymerase II. Much of the thinking about how promoters are organized derives from analyses of several viral promoters, including those for the HSV thymidine kinase (tk) and SV40 early transcription units. These studies, augmented by more recent work, have shown that promoters are composed of discrete functional modules, each consisting of approximately 7-20 bp of DNA, and containing one or more recognition sites for transcriptional activator or repressor proteins.

[0074] At least one module in each promoter functions to position thestart site for RNA synthesis. The best known example of this is the TATAbox, but in some promoters lacking a TATA box, such as the promoter forthe mammalian terminal deoxynucleotidyl transferase gene and thepromoter for the SV40 late genes, a discrete element overlying the startsite itself helps to fix the place of initiation.

[0075] Additional promoter elements regulate the frequency of transcriptional initiation.

[0076] Typically, these are located in the region 30-110 bp upstream ofthe start site, although a number of promoters have recently been shownto contain functional elements downstream of the start site as well. Thespacing between promoter elements frequently is flexible, so thatpromoter function is preserved when elements are inverted or movedrelative to one another. In the tk promoter, the spacing betweenpromoter elements can be increased to 50 bp apart before activity beginsto decline. Depending on the promoter, it appears that individualelements can function either co-operatively or independently to activatetranscription.

[0077] The particular promoter that is employed to control theexpression of a nucleic acid is not believed to be critical, so long asit is capable of expressing the nucleic acid in the targeted cell. Thus,where a human cell is targeted, it is preferable to position the nucleicacid coding region adjacent to and under the control of a promoter thatis capable of being expressed in a human cell. Generally speaking, sucha promoter might include either a human or viral promoter. Preferredpromoters include those derived from HSV. Another preferred embodimentis the tetracycline controlled promoter.

[0078] In various other embodiments, the human cytomegalovirus (CMV) immediate early gene promoter, the SV40 early promoter and the Rous sarcoma virus long terminal repeat can be used to obtain high-level expression of transgenes. The use of other viral or mammalian cellular or bacterial phage promoters which are well-known in the art to achieve expression of a transgene is contemplated as well, provided that the levels of expression are sufficient for a given purpose. It is envisioned that any elements/promoters may be employed in the context of the present invention. Below is a list of viral promoters, cellular promoters/enhancers and inducible promoters/enhancers that could be used in combination with the nucleic acid encoding a gene of interest in an expression construct. Enhancer/promoter elements contemplated for use with the present invention include but are not limited to Immunoglobulin Heavy Chain, Immunoglobulin Light Chain, T-Cell Receptor, HLA DQ α and DQ β, β-Interferon, Interleukin-2, Interleukin-2 Receptor, MHC Class II 5, MHC Class II HLA-DRα, β-Actin, Muscle Creatine Kinase, Prealbumin (Transthyretin), Elastase I, Metallothionein, Collagenase, Albumin Gene, α-Fetoprotein, γ-Globin, β-Globin, c-fos, c-HA-ras, Insulin, Neural Cell Adhesion Molecule (NCAM), α1-Antitrypsin, H2B (TH2B) Histone, Mouse or Type I Collagen, Glucose-Regulated Proteins (GRP94 and GRP78), Rat Growth Hormone, Human Serum Amyloid A (SAA), Troponin I (TN I), Platelet-Derived Growth Factor, Duchenne Muscular Dystrophy, SV40, Polyoma, Retroviruses, Papilloma Virus, Hepatitis B Virus, Human Immunodeficiency Virus, Cytomegalovirus, Gibbon Ape Leukemia Virus. Inducible promoter elements and their associated inducers are listed in Table 2 below. This list is not intended to be exhaustive of all the possible elements involved in the promotion of transgene expression but, merely, to be exemplary thereof. Additionally, any promoter/enhancer combination (as per the Eukaryotic Promoter Data Base EPDB) could also be used to drive expression of the gene. Eukaryotic cells can support cytoplasmic transcription from certain bacterial promoters if the appropriate bacterial polymerase is provided, either as part of the delivery complex or as an additional genetic expression construct.

[0079] Enhancers were originally detected as genetic elements thatincreased transcription from a promoter located at a distant position onthe same molecule of DNA. This ability to act over a large distance hadlittle precedent in classic studies of prokaryotic transcriptionalregulation. Subsequent work showed that regions of DNA with enhanceractivity are organized much like promoters. That is, they are composedof many individual elements, each of which binds to one or moretranscriptional proteins.

[0080] The basic distinction between enhancers and promoters is operational. An enhancer region as a whole must be able to stimulate transcription at a distance; this need not be true of a promoter region or its component elements. On the other hand, a promoter must have one or more elements that direct initiation of RNA synthesis at a particular site and in a particular orientation, whereas enhancers lack these specificities. Promoters and enhancers are often overlapping and contiguous, often seeming to have a very similar modular organization.

TABLE 2

Element                              Inducer
MT II                                Phorbol Ester (TPA), Heavy metals
MMTV (mouse mammary tumor virus)     Glucocorticoids
β-Interferon                         poly(rI)X, poly(rc)
Adenovirus 5 E2                      E1a
c-jun                                Phorbol Ester (TPA), H₂O₂
Collagenase                          Phorbol Ester (TPA)
Stromelysin                          Phorbol Ester (TPA), IL-1
SV40                                 Phorbol Ester (TPA)
Murine MX Gene                       Interferon, Newcastle Disease Virus
GRP78 Gene                           A23187
α-2-Macroglobulin                    IL-6
Vimentin                             Serum
MHC Class I Gene H-2kB               Interferon
HSP70                                E1a, SV40 Large T Antigen
Proliferin                           Phorbol Ester-TPA
Tumor Necrosis Factor                PMA
Thyroid Stimulating Hormone α Gene   Thyroid Hormone

[0081] Use of the baculovirus system will involve high level expression from the powerful polyhedrin promoter. One will typically include a polyadenylation signal to effect proper polyadenylation of the transcript. The nature of the polyadenylation signal is not believed to be crucial to the successful practice of the invention, and any such sequence may be employed. Preferred embodiments include the SV40 polyadenylation signal and the bovine growth hormone polyadenylation signal, convenient and known to function well in various target cells. Also contemplated as an element of the expression cassette is a terminator. These elements can serve to enhance message levels and to minimize read through from the cassette into other sequences.

[0082] A specific initiation signal also may be required for efficienttranslation of coding sequences. These signals include the ATGinitiation codon and adjacent sequences. Exogenous translational controlsignals, including the ATG initiation codon, may need to be provided.One of ordinary skill in the art would readily be capable of determiningthis and providing the necessary signals. It is well known that theinitiation codon must be “in-frame” with the reading frame of thedesired coding sequence to ensure translation of the entire insert. Theexogenous translational control signals and initiation codons can beeither natural or synthetic. The efficiency of expression may beenhanced by the inclusion of appropriate transcription enhancer elements(Bittner et al., 1987).

[0083] In certain embodiments, it may be desirable to include specialized regions known as telomeres at the end of a genome sequence. Telomeres are repeated sequences found at chromosome ends, and it has long been known that chromosomes with truncated ends are unstable, tend to fuse with other chromosomes and are otherwise lost during cell division. Some data suggest that telomeres interact with the nucleoprotein complex and the nuclear matrix. One putative role for telomeres includes stabilizing chromosomes and shielding the ends from degradative enzymes.

[0084] Another possible role for telomeres is in replication. According to present doctrine, replication of DNA starts from short RNA primers annealed to the 3′-end of the template. The result of this mechanism is an “end replication problem” in which the region corresponding to the RNA primer is not replicated. Over many cell divisions, this will result in the progressive truncation of the chromosome. It is thought that telomeres may provide a buffer against this effect, at least until they are themselves eliminated by this effect. A further structure to be included in DNA segments is a centromere.

[0085] In certain embodiments of the invention, the delivery of anucleic acid in a cell may be identified in vitro or in vivo byincluding a marker in the expression construct. The marker would resultin an identifiable change to the transfected cell permitting easyidentification of expression.

[0086] A number of selection systems may be used, including, but not limited to, the herpes simplex virus thymidine kinase (Wigler et al., 1977), hypoxanthine-guanine phosphoribosyltransferase (Szybalska et al., 1962) and adenine phosphoribosyltransferase genes (Lowy et al., 1980), in tk⁻, hgprt⁻ or aprt⁻ cells, respectively. Also, antimetabolite resistance can be used as the basis of selection for dhfr, which confers resistance to methotrexate (Wigler et al., 1980; O'Hare et al., 1981); gpt, which confers resistance to mycophenolic acid (Mulligan et al., 1981); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin et al., 1981); and hygro, which confers resistance to hygromycin.

[0087] Usually the inclusion of a drug selection marker aids in cloningand in the selection of transformants, for example, neomycin, puromycin,hygromycin, DHFR, GPT, zeocin and histidinol. Alternatively, enzymessuch as herpes simplex virus thymidine kinase (tk) (eukaryotic) orchloramphenicol acetyltransferase (CAT) (prokaryotic) may be employed.Immunologic markers also can be employed. The selectable marker employedis not believed to be important, so long as it is capable of beingexpressed simultaneously with the nucleic acid encoding a gene product.Further examples of selectable markers are well known to one of skill inthe art.

[0088] In certain embodiments of the invention, internal ribosome entry site (IRES) elements are used to create multigene, or polycistronic, messages. IRES elements are able to bypass the ribosome scanning model of 5′ methylated Cap dependent translation and begin translation at internal sites (Pelletier and Sonenberg, 1988). IRES elements from two members of the picornavirus family (polio and encephalomyocarditis) have been described (Pelletier and Sonenberg, 1988), as well as an IRES from a mammalian message (Macejak and Sarnow, 1991). IRES elements can be linked to heterologous open reading frames. Multiple open reading frames can be transcribed together, each separated by an IRES, creating polycistronic messages. By virtue of the IRES element, each open reading frame is accessible to ribosomes for efficient translation. Multiple genes can be efficiently expressed using a single promoter/enhancer to transcribe a single message.

[0089] Any heterologous open reading frame can be linked to IRES elements. This includes genes for secreted proteins, multi-subunit proteins encoded by independent genes, intracellular or membrane-bound proteins and selectable markers. In this way, expression of several proteins can be simultaneously engineered into a cell with a single construct and a single selectable marker.

[0090] D. Encoded Proteins

[0091] In this application, the inventors use genetic information for creative or synthetic purposes. The complete genome sequence will give a catalog of all genes necessary for the survival, reproduction, evolution and speciation of an organism and, given suitable high tech tools, the genomic information may be modified or even created from “scratch” in order to synthesize life. Thus it is contemplated that a combination of suitable energy generation genes, regulatory genes, and other functional genes could be constructed which would be sufficient to render an artificial organism with the basic functionalities to enable independent survival.

[0092] To meet this goal, the present invention utilizes known cDNA sequences for any given gene to express proteins in an artificial organism. Any protein so expressed in this invention may be modified for particular purposes according to methods well known to those of skill in the art. For example, particular peptide residues may be derivatized or chemically modified in order to alter the immune response or to permit coupling of the peptide to other agents. It also is possible to change particular amino acids within the peptides without disturbing the overall structure or antigenicity of the peptide. Such changes are therefore termed “conservative” changes and tend to rely on the hydrophilicity or polarity of the residue. The size and/or charge of the side chains also are relevant factors in determining which substitutions are conservative.

[0093] Once the entire coding sequence of a gene has been determined,the gene can be inserted into an appropriate expression system. The genecan be expressed in any number of different recombinant DNA expressionsystems to generate large amounts of the polypeptide product, which canthen be purified and used to vaccinate animals to generate antisera withwhich further studies may be conducted.

[0094] Examples of expression systems known to the skilled practitioner in the art include bacteria such as E. coli, yeast such as Saccharomyces cerevisiae and Pichia pastoris, baculovirus, and mammalian expression systems such as in COS or CHO cells. In one embodiment, polypeptides are expressed in E. coli and in baculovirus expression systems. A complete gene can be expressed or, alternatively, fragments of the gene encoding portions of polypeptide can be produced.

[0095] In one embodiment, the gene sequence encoding the polypeptide isanalyzed to detect putative transmembrane sequences. Such sequences aretypically very hydrophobic and are readily detected by the use ofstandard sequence analysis software, such as MacVector (IBI, New Haven,Conn.). The presence of transmembrane sequences is often deleteriouswhen a recombinant protein is synthesized in many expression systems,especially E. coli, as it leads to the production of insolubleaggregates that are difficult to renature into the native conformationof the protein. Deletion of transmembrane sequences typically does notsignificantly alter the conformation of the remaining protein structure.
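The kind of screen performed by sequence analysis software for hydrophobic, putative transmembrane stretches can be sketched as a sliding-window average over hydropathy values. The values below are the Kyte & Doolittle indices listed later in this specification; the 19-residue window and the 1.6 cutoff are assumptions commonly associated with that method, and the function names are hypothetical. This is not a description of how MacVector itself works.

```python
# Kyte & Doolittle hydropathic indices, keyed by one-letter amino acid code.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def windowed_hydropathy(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the protein."""
    scores = [HYDROPATHY[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def putative_tm_windows(protein: str, window: int = 19,
                        threshold: float = 1.6) -> list[int]:
    """Start positions (0-based) of windows whose mean exceeds the cutoff."""
    return [i for i, avg in enumerate(windowed_hydropathy(protein, window))
            if avg > threshold]


example = "DDKK" + "L" * 19 + "DDKK"  # hypothetical sequence with a hydrophobic core
print(putative_tm_windows(example))
```

Windows flagged this way mark candidate regions only; whether to delete them from an expression construct remains a design decision.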

[0096] Moreover, transmembrane sequences, being by definition embedded within a membrane, are inaccessible. Therefore, antibodies to these sequences will not prove useful for in vivo or in situ studies. Deletion of transmembrane-encoding sequences from the genes used for expression can be achieved by standard techniques. For example, fortuitously-placed restriction enzyme sites can be used to excise the desired gene fragment, or PCR™-type amplification can be used to amplify only the desired part of the gene. The skilled practitioner will realize that such changes must be designed so as not to change the translational reading frame for downstream portions of the protein-encoding sequence.
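The reading-frame requirement amounts to a simple arithmetic check: the number of nucleotides removed must be a multiple of three. A minimal sketch follows, with a hypothetical function name and example sequence.

```python
def delete_in_frame(cds: str, start: int, end: int) -> str:
    """Remove cds[start:end], refusing deletions that would shift the
    reading frame of the downstream sequence."""
    if not (0 <= start <= end <= len(cds)):
        raise ValueError("deletion coordinates fall outside the sequence")
    if (end - start) % 3 != 0:
        raise ValueError("deletion length must be a multiple of 3 "
                         "to keep downstream codons in frame")
    return cds[:start] + cds[end:]


cds = "ATGAAACCCGGGTAA"  # hypothetical coding sequence: ATG AAA CCC GGG TAA
print(delete_in_frame(cds, 3, 9))  # removes the AAA and CCC codons
```

A deletion that also starts on a codon boundary, as in this example, avoids creating a new hybrid codon at the junction; the check above enforces only the frame itself.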

[0097] In one embodiment, computer sequence analysis is used todetermine the location of the predicted major antigenic determinantepitopes of the polypeptide. Software capable of carrying out thisanalysis is readily available commercially, for example MacVector (IBI,New Haven, Conn.). The software typically uses standard algorithms suchas the Kyte/Doolittle or Hopp/Woods methods for locating hydrophilicsequences which are characteristically found on the surface of proteinsand are, therefore, likely to act as antigenic determinants.

[0098] Once this analysis is made, polypeptides can be prepared thatcontain at least the essential features of the antigenic determinant andthat can be employed in the generation of antisera against thepolypeptide. Minigenes or gene fusions encoding these determinants canbe constructed and inserted into expression vectors by standard methods,for example, using PCR™ methodology.

[0099] The gene or gene fragment encoding a polypeptide can be inserted into an expression vector by standard subcloning techniques. In one embodiment, an E. coli expression vector is used that produces the recombinant polypeptide as a fusion protein, allowing rapid affinity purification of the protein. Examples of such fusion protein expression systems are the glutathione S-transferase system (Pharmacia, Piscataway, N.J.), the maltose binding protein system (NEB, Beverly, Mass.), the FLAG system (IBI, New Haven, Conn.), and the 6×His system (Qiagen, Chatsworth, Calif.).

[0100] Some of these systems produce recombinant polypeptides bearing only a small number of additional amino acids, which are unlikely to affect the antigenic ability of the recombinant polypeptide. For example, both the FLAG system and the 6×His system add only short sequences, both of which are known to be poorly antigenic and which do not adversely affect folding of the polypeptide to its native conformation. Other fusion systems produce polypeptides where it is desirable to excise the fusion partner from the desired polypeptide. In one embodiment, the fusion partner is linked to the recombinant polypeptide by a peptide sequence containing a specific recognition sequence for a protease. Examples of suitable sequences are those recognized by the Tobacco Etch Virus protease (Life Technologies, Gaithersburg, Md.) or Factor Xa (New England Biolabs, Beverly, Mass.).

[0101] Recombinant bacterial cells, for example E. coli, are grown inany of a number of suitable media, for example LB, and the expression ofthe recombinant polypeptide induced by adding IPTG to the media orswitching incubation to a higher temperature. After culturing thebacteria for a further period of between 2 and 24 h, the cells arecollected by centrifugation and washed to remove residual media. Thebacterial cells are then lysed, for example, by disruption in a cellhomogenizer and centrifuged to separate the dense inclusion bodies andcell membranes from the soluble cell components. This centrifugation canbe performed under conditions whereby the dense inclusion bodies areselectively enriched by incorporation of sugars such as sucrose into thebuffer and centrifugation at a selective speed.

[0102] In another embodiment, the expression system used is one driven by the baculovirus polyhedrin promoter. The gene encoding the polypeptide can be manipulated by standard techniques in order to facilitate cloning into the baculovirus vector. One baculovirus vector is the pBlueBac vector (Invitrogen, Sorrento, Calif.). The vector carrying the gene for the polypeptide is transfected into Spodoptera frugiperda (Sf9) cells by standard protocols, and the cells are cultured and processed to produce the recombinant antigen. See Summers et al., A MANUAL OF METHODS FOR BACULOVIRUS VECTORS AND INSECT CELL CULTURE PROCEDURES, Texas Agricultural Experimental Station.

[0103] In designing a gene that encodes a particular polypeptide, the hydropathic index of amino acids may be considered. Table 3 provides a codon table showing the nucleic acids that encode a particular amino acid. The importance of the hydropathic amino acid index in conferring interactive biologic function on a protein is generally understood in the art (Kyte & Doolittle, 1982). The following is a brief discussion of the hydropathic amino acid index for use in the present invention.

TABLE 3

Amino Acids                    Codons
Alanine          Ala    A      GCA GCC GCG GCU
Cysteine         Cys    C      UGC UGU
Aspartic acid    Asp    D      GAC GAU
Glutamic acid    Glu    E      GAA GAG
Phenylalanine    Phe    F      UUC UUU
Glycine          Gly    G      GGA GGC GGG GGU
Histidine        His    H      CAC CAU
Isoleucine       Ile    I      AUA AUC AUU
Lysine           Lys    K      AAA AAG
Leucine          Leu    L      UUA UUG CUA CUC CUG CUU
Methionine       Met    M      AUG
Asparagine       Asn    N      AAC AAU
Proline          Pro    P      CCA CCC CCG CCU
Glutamine        Gln    Q      CAA CAG
Arginine         Arg    R      AGA AGG CGA CGC CGG CGU
Serine           Ser    S      AGC AGU UCA UCC UCG UCU
Threonine        Thr    T      ACA ACC ACG ACU
Valine           Val    V      GUA GUC GUG GUU
Tryptophan       Trp    W      UGG
Tyrosine         Tyr    Y      UAC UAU
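A codon table of this kind maps directly onto a lookup structure for gene design. The sketch below encodes the standard genetic code assignments for the twenty amino acids, adds the three stop codons (not part of Table 3, included here as an assumption so that translation can terminate), and shows translation and a naive back-translation. The function names are hypothetical, and picking the first listed codon is an arbitrary illustrative choice, not a codon-usage recommendation.

```python
# Amino acid (one-letter code) -> RNA codons, standard genetic code.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"], "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"], "E": ["GAA", "GAG"], "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"], "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"], "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"], "M": ["AUG"],
    "N": ["AAC", "AAU"], "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"], "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"], "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"], "Y": ["UAC", "UAU"],
}
STOP_CODONS = {"UAA", "UAG", "UGA"}  # assumption: not listed in Table 3

# Inverted view: codon -> amino acid.
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def translate(rna: str) -> str:
    """Translate an RNA (or DNA) coding sequence until the first stop codon."""
    rna = rna.upper().replace("T", "U")
    protein = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break
        protein.append(CODON_TO_AA[codon])
    return "".join(protein)


def back_translate(protein: str) -> str:
    """Return one possible coding sequence, using the first listed codon."""
    return "".join(CODONS[aa][0] for aa in protein.upper())


print(translate("AUGGCUUGGUAA"))
print(back_translate("MKV"))
```

Because most amino acids have several codons, back-translation is one-to-many; a real design would choose among synonymous codons according to the intended host.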

[0104] It is accepted that the relative hydropathic character of theamino acid contributes to the secondary structure of the resultantprotein, which in turn defines the interaction of the protein with othermolecules, for example, enzymes, substrates, receptors, DNA, antibodies,antigens, and the like.

[0105] Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics (Kyte & Doolittle, 1982). These are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (−0.4); threonine (−0.7); serine (−0.8); tryptophan (−0.9); tyrosine (−1.3); proline (−1.6); histidine (−3.2); glutamate (−3.5); glutamine (−3.5); aspartate (−3.5);

[0106] asparagine (−3.5); lysine (−3.9); and arginine (−4.5).

[0107] It is known in the art that certain amino acids may be substituted by other amino acids having a similar hydropathic index or score and still result in a protein with similar biological activity, i.e., still obtain a biologically functionally equivalent protein. In making such changes, the substitution of amino acids whose hydropathic indices are within ±2 is preferred, those which are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred.
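The tiered preference for substitutions can be expressed as a lookup on the difference between two hydropathic indices. The following is a minimal sketch using the Kyte & Doolittle values given in this specification; the function name and the convention of returning the tightest tier as a number are hypothetical choices for illustration.

```python
# Kyte & Doolittle hydropathic indices, keyed by one-letter amino acid code.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def substitution_tier(original: str, replacement: str):
    """Return the tightest tier (0.5, 1.0 or 2.0) that the substitution
    satisfies, or None if the indices differ by more than 2."""
    diff = abs(HYDROPATHY[original.upper()] - HYDROPATHY[replacement.upper()])
    diff = round(diff, 1)  # indices carry one decimal place; avoid float noise
    for tier in (0.5, 1.0, 2.0):
        if diff <= tier:
            return tier
    return None


for pair in [("I", "V"), ("K", "R"), ("L", "C"), ("I", "D")]:
    print(pair, substitution_tier(*pair))
```

A result of 0.5 corresponds to the "even more particularly preferred" class, 1.0 to "particularly preferred", and 2.0 to "preferred".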

[0108] It is also understood in the art that the substitution of likeamino acids can be made effectively on the basis of hydrophilicity. U.S.Pat. No. 4,554,101, incorporated herein by reference, states that thegreatest local average hydrophilicity of a protein, as governed by thehydrophilicity of its adjacent amino acids, correlates with a biologicalproperty of the protein.

[0109] As detailed in U.S. Pat. No. 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (−0.4); proline (−0.5±1); alanine (−0.5); histidine (−0.5); cysteine (−1.0); methionine (−1.3); valine (−1.5); leucine (−1.8); isoleucine (−1.8); tyrosine (−2.3); phenylalanine (−2.5); tryptophan (−3.4).

[0110] It is understood that an amino acid can be substituted foranother having a similar hydrophilicity value and still obtain abiologically equivalent and immunologically equivalent protein. In suchchanges, the substitution of amino acids whose hydrophilicity values arewithin ±2 is preferred, those that are within ±1 are particularlypreferred, and those within ±0.5 are even more particularly preferred.
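The notion of the "greatest local average hydrophilicity" can be made concrete with a sliding window over the hydrophilicity values listed above. In this sketch the central values are used and the ±1 ranges are ignored; the six-residue window, the function name and the example sequence are assumptions made for illustration.

```python
# Hydrophilicity values as listed in the text (central values only).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def most_hydrophilic_window(protein: str, window: int = 6):
    """Return (start, segment, mean) for the window with the greatest
    local average hydrophilicity; the first such window wins ties."""
    seq = protein.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_avg = 0, float("-inf")
    for i in range(len(seq) - window + 1):
        avg = sum(HYDROPHILICITY[aa] for aa in seq[i:i + window]) / window
        if avg > best_avg:
            best_start, best_avg = i, avg
    return best_start, seq[best_start:best_start + window], best_avg


start, segment, mean = most_hydrophilic_window("LLLLLKDEKRDLLLLL")  # hypothetical
print(start, segment, round(mean, 2))
```

The segment returned is a candidate surface-exposed region and hence a candidate antigenic determinant, consistent with the hydrophilicity-based analysis discussed in connection with epitope prediction.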

[0111] As outlined above, amino acid substitutions are generally basedon the relative similarity of the amino acid side-chain substituents,for example, their hydrophobicity, hydrophilicity, charge, size, and thelike. Exemplary substitutions that take various of the foregoingcharacteristics into consideration are well known to those of skill inthe art and include: arginine and lysine; glutamate and aspartate;serine and threonine; glutamine and asparagine; and valine, leucine andisoleucine.

[0112] E. Expression of and Delivery of Genes

[0113] I. Expression

[0114] Once the designer gene, genome or biological system has been made according to the methods described herein, the polynucleotides can be expressed as encoded peptides or proteins of the gene, genome or biological system. The engineering of the polynucleotides for expression in a prokaryotic or eukaryotic system may be performed by techniques generally known to those of skill in recombinant expression. Therefore, promoters and other elements specific to a bacterial, mammalian or other system may be included in the polynucleotide sequence. It is believed that virtually any expression system may be employed in the expression of the claimed nucleic acid sequences.

[0115] The artificially generated polynucleotide sequences are suitablefor eukaryotic expression, as the host cell will generally process thegenomic transcripts to yield functional mRNA for translation intoprotein. It is believed that the use of a designer gene version willprovide advantages in that the size of the gene will generally be muchsmaller and more readily employed to transfect the targeted cell thanwill a genomic gene, which will typically be up to an order of magnitudelarger than the designer gene. However, the inventor does not excludethe possibility of employing a genomic version of a particular genewhere desired.

[0116] As used herein, the terms “engineered” and “recombinant” cells are intended to refer to a cell into which an exogenous polynucleotide described herein has been introduced. Therefore, engineered cells are distinguishable from naturally-occurring cells which do not contain a recombinantly introduced exogenous polynucleotide. Engineered cells are thus cells having a gene or genes introduced through the hand of man. Recombinant cells include those having an introduced polynucleotide, and also include polynucleotides positioned adjacent to a promoter not naturally associated with the particular introduced gene.

[0117] To express a recombinant encoded protein or peptide, whethermutant or wild-type, in accordance with the present invention one wouldprepare an expression vector that comprises one of the claimed isolatednucleic acids under the control of one or more promoters. To bring acoding sequence “under the control of” a promoter, one positions the 5′end of the translational initiation site of the reading frame generallybetween about 1 and 50 nucleotides “downstream” of (i.e., 3′ of) thechosen promoter. The “upstream” promoter stimulates transcription of theinserted DNA and promotes expression of the encoded recombinant protein.This is the meaning of “recombinant expression” in the context usedhere.

[0118] Many standard techniques are available to construct expressionvectors containing the appropriate nucleic acids andtranscriptional/translational control sequences in order to achieveprotein or peptide expression in a variety of host-expression systems.Cell types available for expression include, but are not limited to,bacteria, such as E. coli and B. subtilis transformed with recombinantphage DNA, plasmid DNA or cosmid DNA expression vectors.

[0119] Certain examples of prokaryotic hosts are E. coli strain RRI, E.coli LE392, E. coli B, E. coli χ 1776 (ATCC No. 31537) as well as E.coli W3110 (F—, lambda-, prototrophic, ATCC No. 273325); bacilli such asBacillus subtilis; and other enterobacteriaceae such as Salmonellatyphimurium, Serratia marcescens, and various Pseudomonas species.

[0120] In general, plasmid vectors containing replicon and controlsequences that are derived from species compatible with the host cellare used in connection with these hosts. The vector ordinarily carries areplication site, as well as marking sequences that are capable ofproviding phenotypic selection in transformed cells. For example, E.coli is often transformed using pBR322, a plasmid derived from an E.coli species. Plasmid pBR322 contains genes for ampicillin andtetracycline resistance and thus provides easy means for identifyingtransformed cells. The pBR322 plasmid, or other microbial plasmid orphage must also contain, or be modified to contain, promoters that canbe used by the microbial organism for expression of its own proteins.

[0121] In addition, phage vectors containing replicon and control sequences that are compatible with the host microorganism can be used as transforming vectors in connection with these hosts. For example, the phage lambda GEM™-11 may be utilized in making a recombinant phage vector that can be used to transform host cells, such as E. coli LE392.

[0122] Further useful vectors include pIN vectors (Inouye et al., 1985); and pGEX vectors, for use in generating glutathione S-transferase (GST) soluble fusion proteins for later purification and separation or cleavage. Other suitable fusion proteins are those with β-galactosidase, ubiquitin, or the like.

[0123] Promoters that are most commonly used in recombinant DNA construction include the β-lactamase (penicillinase), lactose and tryptophan (trp) promoter systems. While these are the most commonly used, other microbial promoters have been discovered and utilized, and details concerning their nucleotide sequences have been published, enabling those of skill in the art to ligate them functionally with plasmid vectors.

[0124] For expression in Saccharomyces, the plasmid YRp7, for example, is commonly used (Stinchcomb et al., 1979; Kingsman et al., 1979; Tschemper et al., 1980). This plasmid contains the trp1 gene, which provides a selection marker for a mutant strain of yeast lacking the ability to grow in the absence of tryptophan, for example ATCC No. 44076 or PEP4-1 (Jones, 1977). The presence of the trp1 lesion as a characteristic of the yeast host cell genome then provides an effective environment for detecting transformation by growth in the absence of tryptophan.

[0125] Suitable promoting sequences in yeast vectors include the promoters for 3-phosphoglycerate kinase (Hitzeman et al., 1980) or other glycolytic enzymes (Hess et al., 1968; Holland et al., 1978), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. In constructing suitable expression plasmids, the termination sequences associated with these genes are also ligated into the expression vector 3′ of the sequence desired to be expressed to provide polyadenylation of the mRNA and termination.

[0126] Other suitable promoters, which have the additional advantage of transcription controlled by growth conditions, include the promoter region for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, and the aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization.

[0127] In addition to micro-organisms, cultures of cells derived from multicellular organisms may also be used as hosts. In principle, any such cell culture is workable, whether from vertebrate or invertebrate culture. In addition to mammalian cells, these include insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus); and plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing one or more coding sequences.

[0128] In a useful insect system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes. The virus grows in Spodoptera frugiperda cells. The isolated nucleic acid coding sequences are cloned into non-essential regions (for example the polyhedrin gene) of the virus and placed under control of an AcNPV promoter (for example, the polyhedrin promoter). Successful insertion of the coding sequences results in the inactivation of the polyhedrin gene and production of non-occluded recombinant virus (i.e., virus lacking the proteinaceous coat coded for by the polyhedrin gene). These recombinant viruses are then used to infect Spodoptera frugiperda cells in which the inserted gene is expressed (e.g., U.S. Pat. No. 4,215,051).

[0129] Examples of useful mammalian host cell lines are VERO and HeLa cells, Chinese hamster ovary (CHO) cell lines, WI38, BHK, COS-7, 293, HepG2, NIH3T3, RIN and MDCK cell lines. In addition, a host cell may be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be important for the function of the encoded protein.

[0130] Different host cells have characteristic and specific mechanisms for the post-translational processing and modification of proteins. Appropriate cell lines or host systems can be chosen to ensure the correct modification and processing of the foreign protein expressed. Expression vectors for use in mammalian cells ordinarily include an origin of replication (as necessary), a promoter located in front of the gene to be expressed, along with any necessary ribosome binding sites, RNA splice sites, polyadenylation site, and transcriptional terminator sequences. The origin of replication may be provided either by construction of the vector to include an exogenous origin, such as may be derived from SV40 or other viral (e.g., Polyoma, Adeno, VSV, BPV) source, or may be provided by the host cell chromosomal replication mechanism. If the vector is integrated into the host cell chromosome, the latter is often sufficient.

[0131] The promoters may be derived from the genome of mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter). Further, it is also possible, and may be desirable, to utilize promoter or control sequences normally associated with the desired gene sequence, provided such control sequences are compatible with the host cell systems.

[0132] Specific initiation signals may also be required for efficient translation of the claimed isolated nucleic acid coding sequences. These signals include the ATG initiation codon and adjacent sequences. Exogenous translational control signals, including the ATG initiation codon, may additionally need to be provided. One of ordinary skill in the art would readily be capable of determining this need and providing the necessary signals. It is well known that the initiation codon must be in-frame (or in-phase) with the reading frame of the desired coding sequence to ensure translation of the entire insert. These exogenous translational control signals and initiation codons can be of a variety of origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of appropriate transcription enhancer elements or transcription terminators (Bittner et al., 1987).
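
By way of illustration only, the in-frame requirement described above can be checked computationally. The following Python sketch is not part of the original disclosure; the function name `in_frame_start_codons`, the zero-based coordinates, and the example sequence are hypothetical and chosen solely for illustration. It reports each ATG at or upstream of the coding sequence whose distance to the first codon of that sequence is a multiple of three and that is not separated from it by an in-frame stop codon.

```python
# Illustrative sketch only; names and coordinates are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_start_codons(seq: str, cds_start: int) -> list[int]:
    """Return zero-based positions of ATG codons at or upstream of cds_start
    that are in frame with it and reach it without an in-frame stop codon."""
    seq = seq.upper()
    hits = []
    for i in range(cds_start + 1):
        if seq[i:i + 3] != "ATG":
            continue
        # In frame means the distance to the coding sequence is a multiple of 3.
        if (cds_start - i) % 3 != 0:
            continue
        # Codons read from the candidate ATG up to (not including) cds_start.
        codons = [seq[j:j + 3] for j in range(i, cds_start, 3)]
        if any(codon in STOP_CODONS for codon in codons):
            continue
        hits.append(i)
    return hits
```

For the hypothetical sequence CCATGGCTGGGAAATTT with the coding sequence taken to begin at index 8, the function returns `[2]`; removing the first nucleotide shifts the ATG out of frame and the result is an empty list.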

[0133] In eukaryotic expression, one will also typically desire to incorporate into the transcriptional unit an appropriate polyadenylation site (e.g., 5′-AATAAA-3′) if one was not contained within the original cloned segment. Typically, the poly A addition site is placed about 30 to 2000 nucleotides “downstream” of the termination site of the protein at a position prior to transcription termination.
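
The placement guideline above can likewise be sketched in code. The following Python example is a minimal illustration under stated assumptions and is not taken from the original disclosure: the function names, the convention that `stop_end` is the zero-based index immediately after the stop codon, and the test sequence are all hypothetical. It measures how far each downstream AATAAA motif lies from the end of the coding sequence and checks whether any falls within the 30 to 2000 nucleotide window.

```python
# Illustrative sketch only; names and the distance convention are hypothetical.
POLY_A_SIGNAL = "AATAAA"


def poly_a_offsets(seq: str, stop_end: int) -> list[int]:
    """Distances (in nucleotides) from the end of the stop codon to the start
    of each AATAAA motif found downstream of it."""
    seq = seq.upper()
    offsets = []
    pos = seq.find(POLY_A_SIGNAL, stop_end)
    while pos != -1:
        offsets.append(pos - stop_end)
        # Continue one position later so overlapping motifs are not missed.
        pos = seq.find(POLY_A_SIGNAL, pos + 1)
    return offsets


def has_well_placed_signal(seq: str, stop_end: int,
                           min_nt: int = 30, max_nt: int = 2000) -> bool:
    """True when at least one AATAAA lies within the stated window."""
    return any(min_nt <= d <= max_nt for d in poly_a_offsets(seq, stop_end))
```

The window bounds are exposed as parameters because the text states them only approximately ("about 30 to 2000 nucleotides").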

[0134] For long-term, high-yield production of recombinant proteins, stable expression is preferred. For example, cell lines that stably express constructs encoding proteins may be engineered. Rather than using expression vectors that contain viral origins of replication, host cells can be transformed with vectors controlled by appropriate expression control elements (e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites, etc.), and a selectable marker. Following the introduction of foreign DNA, engineered cells may be allowed to grow for 1-2 days in an enriched medium, and then are switched to a selective medium. The selectable marker in the recombinant plasmid confers resistance to the selection and allows cells to stably integrate the plasmid into their chromosomes and grow to form foci, which in turn can be cloned and expanded into cell lines.

[0135] It is contemplated that the nucleic acids of the invention may be “overexpressed”, i.e., expressed in increased levels relative to their natural expression in human cells, or even relative to the expression of other proteins in the recombinant host cell. Such overexpression may be assessed by a variety of methods, including radio-labeling and/or protein purification. However, simple and direct methods are preferred, for example, those involving SDS/PAGE and protein staining or western blotting, followed by quantitative analyses, such as densitometric scanning of the resultant gel or blot. A specific increase in the level of the recombinant protein or peptide in comparison to the level in natural human cells is indicative of overexpression, as is a relative abundance of the specific protein in relation to the other proteins produced by the host cell and, e.g., visible on a gel.

II. Delivery

[0136] In various embodiments of the invention, the expression construct may comprise a virus or engineered construct derived from a viral genome. The ability of certain viruses to enter cells via receptor-mediated endocytosis and to integrate into the host cell genome and express viral genes stably and efficiently has made them attractive candidates for the transfer of foreign genes into mammalian cells (Ridgeway, 1988; Nicolas and Rubenstein, 1988; Baichwal and Sugden, 1986; Temin, 1986). The first viruses used as vectors were DNA viruses including the papovaviruses (simian virus 40, bovine papilloma virus, and polyoma) (Ridgeway, 1988; Baichwal and Sugden, 1986) and adenoviruses (Ridgeway, 1988; Baichwal and Sugden, 1986) and adeno-associated viruses. Retroviruses also are attractive gene transfer vehicles (Nicolas and Rubenstein, 1988; Temin, 1986) as are vaccinia virus (Ridgeway, 1988) and adeno-associated virus (Ridgeway, 1988). Such vectors may be used to (i) transform cell lines in vitro for the purpose of expressing proteins of interest or (ii) to transform cells in vitro or in vivo to provide therapeutic polypeptides in a gene therapy scenario. Herpes simplex virus (HSV) is another attractive candidate, especially where neurotropism is desired. HSV also is relatively easy to manipulate and can be grown to high titers. Thus, delivery is less of a problem, both in terms of volumes needed to attain sufficient MOI and in a lessened need for repeat dosings.

[0137] With the recent recognition of defective hepatitis B viruses, new insight was gained into the structure-function relationship of different viral sequences. In vitro studies showed that the virus could retain the ability for helper-dependent packaging and reverse transcription despite the deletion of up to 80% of its genome (Horwich et al., 1990). This suggested that large portions of the genome could be replaced with foreign genetic material. The hepatotropism and persistence (integration) were particularly attractive properties for liver-directed gene transfer. Chang et al. recently introduced the chloramphenicol acetyltransferase (CAT) gene into duck hepatitis B virus genome in the place of the polymerase, surface, and pre-surface coding sequences. It was co-transfected with wild-type virus into an avian hepatoma cell line. Culture media containing high titers of the recombinant virus were used to infect primary duckling hepatocytes. Stable CAT gene expression was detected for at least 24 days after transfection (Chang et al., 1991).

[0138] Several non-viral methods for the transfer of expression constructs into cultured mammalian cells also are contemplated by the present invention. These include calcium phosphate precipitation (Graham and Van Der Eb, 1973; Chen and Okayama, 1987; Rippe et al., 1990), DEAE-dextran (Gopal, 1985), electroporation (Tur-Kaspa et al., 1986; Potter et al., 1984), direct microinjection (Harland and Weintraub, 1985), DNA-loaded liposomes (Nicolau and Sene, 1982; Fraley et al., 1979) and lipofectamine-DNA complexes, cell sonication (Fechheimer et al., 1987), gene bombardment using high velocity microprojectiles (Yang et al., 1990), and receptor-mediated transfection (Wu and Wu, 1987; Wu and Wu, 1988). Some of these techniques may be successfully adapted for in vivo or ex vivo use.

[0139] Once the expression construct has been delivered into the cell the nucleic acid encoding the gene of interest may be positioned and expressed at different sites. In certain embodiments, the nucleic acid encoding the gene may be stably integrated into the genome of the cell. This integration may be in the cognate location and orientation via homologous recombination (gene replacement) or it may be integrated in a random, non-specific location (gene augmentation). In yet further embodiments, the nucleic acid may be stably maintained in the cell as a separate, episomal segment of DNA. Such nucleic acid segments or “episomes” encode sequences sufficient to permit maintenance and replication independent of or in synchronization with the host cell cycle. How the expression construct is delivered to a cell and where in the cell the nucleic acid remains is dependent on the type of expression construct employed.

[0140] In one embodiment, the expression construct may simply consist of naked recombinant DNA or plasmids. Transfer of the construct may be performed by any of the methods mentioned above which physically or chemically permeabilize the cell membrane. This is particularly applicable for transfer in vitro but it may be applied to in vivo use as well. Dubensky et al. (1984) successfully injected polyomavirus DNA in the form of calcium phosphate precipitates into liver and spleen of adult and newborn mice demonstrating active viral replication and acute infection. Benvenisty and Neshif (1986) also demonstrated that direct intraperitoneal injection of calcium phosphate-precipitated plasmids results in expression of the transfected genes. It is envisioned that DNA encoding a gene of interest may also be transferred in a similar manner in vivo and express the gene product.

[0141] Another embodiment of the invention for transferring a naked DNA expression construct or DNA segment into cells may involve particle bombardment. This method depends on the ability to accelerate DNA-coated microprojectiles to a high velocity allowing them to pierce cell membranes and enter cells without killing them (Klein et al., 1987). Several devices for accelerating small particles have been developed. One such device relies on a high voltage discharge to generate an electrical current, which in turn provides the motive force (Yang et al., 1990). The microprojectiles used have consisted of biologically inert substances such as tungsten or gold beads.

[0142] Selected organs including the liver, skin, and muscle tissue of rats and mice have been bombarded in vivo (Yang et al., 1990; Zelenin et al., 1991). This may require surgical exposure of the tissue or cells, to eliminate any intervening tissue between the gun and the target organ, i.e., ex vivo treatment. Again, DNA encoding a particular gene may be delivered via this method and still be incorporated by the present invention.

[0143] In a further embodiment of the invention, the DNA segment or expression construct may be entrapped in a liposome. Liposomes are vesicular structures characterized by a phospholipid bilayer membrane and an inner aqueous medium. Multilamellar liposomes have multiple lipid layers separated by aqueous medium. They form spontaneously when phospholipids are suspended in an excess of aqueous solution. The lipid components undergo self-rearrangement before the formation of closed structures and entrap water and dissolved solutes between the lipid bilayers (Ghosh and Bachhawat, 1991). Also contemplated are lipofectamine-DNA complexes.

[0144] Liposome-mediated nucleic acid delivery and expression of DNA in vitro has been very successful. Wong et al. (1980) demonstrated the feasibility of liposome-mediated delivery and expression of foreign DNA in cultured chick embryo, HeLa and hepatoma cells. Nicolau et al. (1987) accomplished successful liposome-mediated gene transfer in rats after intravenous injection.

[0145] In certain embodiments, the liposome may be complexed with a hemagglutinating virus (HVJ). This has been shown to facilitate fusion with the cell membrane and promote cell entry of liposome-encapsulated DNA (Kaneda et al., 1989). In other embodiments, the liposome may be complexed or employed in conjunction with nuclear non-histone chromosomal proteins (HMG-1) (Kato et al., 1991). In yet further embodiments, the liposome may be complexed or employed in conjunction with both HVJ and HMG-1. Because such expression constructs have been successfully employed in transfer and expression of nucleic acid in vitro and in vivo, they are applicable to the present invention. Where a bacterial promoter is employed in the DNA construct, it also will be desirable to include within the liposome an appropriate bacterial polymerase.

[0146] Other expression constructs which can be employed to deliver a nucleic acid encoding a particular gene into cells are receptor-mediated delivery vehicles. These take advantage of the selective uptake of macromolecules by receptor-mediated endocytosis in almost all eukaryotic cells. Because of the cell type-specific distribution of various receptors, the delivery can be highly specific (Wu and Wu, 1993).

[0147] Receptor-mediated gene targeting vehicles generally consist of two components: a cell receptor-specific ligand and a DNA-binding agent. Several ligands have been used for receptor-mediated gene transfer. The most extensively characterized ligands are asialoorosomucoid (ASOR) (Wu and Wu, 1987) and transferrin (Wagner et al., 1990). Recently, a synthetic neoglycoprotein, which recognizes the same receptor as ASOR, has been used as a gene delivery vehicle (Ferkol et al., 1993; Perales et al., 1994) and epidermal growth factor (EGF) has also been used to deliver genes to squamous carcinoma cells (Myers, EPO 0273085).

[0148] In other embodiments, the delivery vehicle may comprise a ligand and a liposome. For example, Nicolau et al. (1987) employed lactosyl-ceramide, a galactose-terminal asialoganglioside, incorporated into liposomes and observed an increase in the uptake of the insulin gene by hepatocytes. Thus, it is feasible that a nucleic acid encoding a particular gene also may be specifically delivered into a cell type such as lung, epithelial or tumor cells, by any number of receptor-ligand systems with or without liposomes.

[0149] In certain embodiments, gene transfer may more easily be performed under ex vivo conditions. Ex vivo gene therapy refers to the isolation of cells from an organism, the delivery of a nucleic acid into the cells in vitro, and then the return of the modified cells back into an organism. This may involve the surgical removal of tissue/organs from an animal or the primary culture of cells and tissues. Anderson et al., U.S. Pat. No. 5,399,346, incorporated herein in its entirety, disclose ex vivo therapeutic methods.

F. Oligonucleotide Synthesis

Oligonucleotide synthesis is well known to those of skill in the art. Various different mechanisms of oligonucleotide synthesis have been disclosed in, for example, U.S. Pat. Nos. 4,659,774, 4,816,571, 5,141,813, 5,264,566, 4,959,463, 5,428,148, 5,554,744, 5,574,146, and 5,602,244, each of which is incorporated herein by reference.

[0150] Phosphoramidite chemistry (Beaucage and Iyer, 1992) has become by far the most widely used coupling chemistry for the synthesis of oligonucleotides. As is well known to those skilled in the art, phosphoramidite synthesis of oligonucleotides involves activation of nucleoside phosphoramidite monomer precursors by reaction with an activating agent to form activated intermediates, followed by sequential addition of the activated intermediates to the growing oligonucleotide chain (generally anchored at one end to a suitable solid support) to form the oligonucleotide product.

[0151] Tetrazole is commonly used for the activation of the nucleoside phosphoramidite monomers. Tetrazole has an acidic proton which presumably protonates the basic nitrogen of the diisopropylamino phosphine group, thus making the diisopropylamino group a leaving group. The negatively charged tetrazolide ion then makes an attack on the trivalent phosphorus, forming a transient phosphorus tetrazolide species. The 5′-OH group of the solid support bound nucleoside then attacks the active trivalent phosphorus species, resulting in the formation of the internucleotide linkage. The trivalent phosphorus is finally oxidized to the pentavalent phosphorus. The US patents listed above describe other activators and solid supports for oligonucleotide synthesis.
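
Because the oligonucleotide is built by sequential addition of one activated monomer per cycle, coupling losses compound with chain length. As a rough illustration only, and not as a statement from the present disclosure, a common simplification assumes that every coupling succeeds with the same efficiency, so that the fraction of full-length product for an oligomer of a given length is approximately the efficiency raised to the number of couplings (length minus one, since the first nucleoside is already attached to the support). The function name and any efficiency values used with it are hypothetical.

```python
# Illustrative sketch only; the uniform-efficiency model is a simplifying
# assumption and the function name is hypothetical.
def full_length_fraction(length: int, coupling_efficiency: float) -> float:
    """Approximate fraction of chains that reach full length when each of the
    (length - 1) coupling cycles succeeds with the same efficiency."""
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must lie between 0 and 1")
    return coupling_efficiency ** (length - 1)
```

Under this model, an assumed per-cycle efficiency of 0.99 gives a full-length fraction of roughly 0.55 for a 60-mer, which suggests why very high per-cycle efficiency matters for longer oligomers.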

[0152] High throughput oligonucleotide synthesis can be achieved using a synthesizer. The Genome Science and Technology Center, as one aspect of the automation development effort, recently developed a high throughput large scale oligonucleotide synthesizer. This instrument, denoted the MERMADE, is based on a 96-well plate format and uses robotic control to carry out parallel synthesis on 192 samples (two 96-well plates). This device has been variously described in the literature and in presentations, and is generally available in the public domain (licensed from the University of Texas and available on contract from Avantec). The device has gone through various generations with differing operating parameters. The device may be used to synthesize 192 oligonucleotides simultaneously with 99% success. It has virtually 100% success for oligomers less than 60 bp; operates at 20 mM synthesis levels, and gives a product yield of >99% complete synthesis. Using these systems the inventor has synthesized over 10,000 oligomers used for sequencing, PCR™ amplification and recombinant DNA applications. For most uses, including cloning, synthesis success is sufficient such that post synthesis purification is not required.
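
The parallel format described above (192 samples across two 96-well plates) implies a simple bookkeeping scheme for assigning each oligonucleotide to a plate and well. The following Python sketch shows one possible assignment. It assumes the conventional 8-row by 12-column plate with rows lettered A through H and a row-major fill order; the actual layout used by the instrument is not specified in this disclosure, and the function name is hypothetical.

```python
# Illustrative sketch only; the row-major layout and names are assumptions.
ROWS = "ABCDEFGH"          # 8 rows of a conventional 96-well plate
COLS = 12                  # 12 columns per row
WELLS_PER_PLATE = len(ROWS) * COLS
PLATES = 2                 # two plates give 192 parallel syntheses


def well_for(index: int) -> tuple[int, str]:
    """Map a zero-based oligonucleotide index to (plate number, well label)."""
    if not 0 <= index < PLATES * WELLS_PER_PLATE:
        raise ValueError("index must be between 0 and 191")
    plate, offset = divmod(index, WELLS_PER_PLATE)
    row, col = divmod(offset, COLS)
    return plate + 1, f"{ROWS[row]}{col + 1}"
```

With this scheme, index 0 maps to plate 1, well A1, and index 191 maps to plate 2, well H12.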

[0153] Once the genome has been synthesized using the methods of the present invention it may be necessary to screen the sequences for analysis of function. Specifically contemplated by the present inventor are chip-based DNA technologies such as those described by Hacia et al. (1996) and Shoemaker et al. (1996). Briefly, these techniques involve quantitative methods for analyzing large numbers of genes rapidly and accurately. By tagging genes with oligonucleotides or using fixed probe arrays, one can employ chip technology to segregate target molecules as high density arrays and screen these molecules on the basis of hybridization. See also Pease et al. (1994); Fodor et al. (1991).

[0154] The use of combinatorial synthesis and high throughput screening assays is well known to those of skill in the art, e.g. U.S. Pat. Nos. 5,807,754; 5,807,683; 5,804,563; 5,789,162; 5,783,384; 5,770,358; 5,759,779; 5,747,334; 5,686,242; 5,198,346; 5,738,996; 5,733,743; 5,714,320; 5,663,046 (each specifically incorporated herein by reference). These patents teach various aspects of the methods and compositions involved in the assembly and activity analyses of high density arrays of different polysubunits (polynucleotides or polypeptides). As such it is contemplated that the methods and compositions described in the patents listed above may be useful in assaying the activity profiles of the compositions of the present invention.

[0155] The present invention produces a replication competent polynucleotide. Viruses are naturally occurring replication competent pieces of DNA. To the extent that disclosure regarding viruses may be useful in the context of the present invention, the following is a disclosure of viruses. Researchers note that viruses have evolved to be able to deliver their DNA to various host tissues despite the human body's various defensive mechanisms. For this reason, numerous viral vectors have been designed by researchers seeking to create vehicles for therapeutic gene delivery. Some of the types of viruses that have been engineered are listed below.

[0156] I. Adenovirus

[0157] Adenovirus is a 36 kb, linear, double-stranded DNA virus that allows substitution of large pieces of adenoviral DNA with foreign sequences up to 7 kb (Grunhaus and Horwitz, 1992). Adenovirus DNA does not integrate into the host cell chromosome because adenoviral DNA can replicate in an episomal manner. Also, adenoviruses are structurally stable, and no genome rearrangement has been detected after extensive amplification. Adenovirus can infect virtually all epithelial cells regardless of their cell cycle stage. This means that adenovirus can infect non-dividing cells. So far, adenoviral infection appears to be linked only to mild disease such as acute respiratory disease in humans. This group of viruses can be obtained in high titers, e.g., 10⁹-10¹¹ plaque-forming units per ml, and they are highly infective.

[0158] Both ends of the viral genome contain 100-200 base pair inverted terminal repeats (ITRs), which are cis elements necessary for viral DNA replication and packaging. The early (E) and late (L) regions of the genome contain different transcription units that are divided by the onset of viral DNA replication. The E1 region (E1A and E1B) encodes proteins responsible for the regulation of transcription of the viral genome and a few cellular genes. The expression of the E2 region (E2A and E2B) results in the synthesis of the proteins for viral DNA replication. These proteins are involved in DNA replication, late gene expression and host cell shut-off (Renan, 1990). The products of the late genes, including the majority of the viral capsid proteins, are expressed only after significant processing of a single primary transcript issued by the major late promoter (MLP). The MLP (located at 16.8 m.u.) is particularly efficient during the late phase of infection, and all the mRNAs issued from this promoter possess a 5′-tripartite leader (TPL) sequence which makes them preferred mRNAs for translation.

[0159] The E3 region encodes proteins that appear to be necessary for efficient lysis of Ad infected cells as well as preventing TNF-mediated cytolysis and CTL mediated lysis of infected cells. In general, the E4 region is believed to encode seven proteins, some of which activate the E2 promoter. It has been shown to block host mRNA transport and enhance transport of viral RNA to cytoplasm. Further, the E4 product is in part responsible for the decrease in early gene expression seen late in infection. E4 also inhibits E1A and E4 (but not E1B) expression during lytic growth. Some E4 proteins are necessary for efficient DNA replication; however, the mechanism for this involvement is unknown. E4 is also involved in post-transcriptional events in viral late gene expression; i.e., alternative splicing of the tripartite leader in lytic growth. Nevertheless, E4 functions are not absolutely required for DNA replication but their lack will delay replication. Other functions include negative regulation of viral DNA synthesis, induction of sub-nuclear reorganization normally seen during adenovirus infection, and other functions that are necessary for viral replication, late viral mRNA accumulation, and host cell transcriptional shut off.

[0160] II. Retroviruses

[0161] The retroviruses are a group of single-stranded RNA viruses characterized by an ability to convert their RNA to double-stranded DNA in infected cells by a process of reverse-transcription (Coffin, 1990). The resulting DNA then stably integrates into cellular chromosomes as a provirus and directs synthesis of viral proteins. The integration results in the retention of the viral gene sequences in the recipient cell and its descendants. The retroviral genome contains three genes, gag, pol, and env that code for capsid proteins, polymerase enzyme, and envelope components, respectively. A sequence found upstream from the gag gene, termed ψ, functions as a signal for packaging of the genome into virions. To produce virions, a packaging cell line containing the gag, pol, and env genes but lacking the LTR and ψ components is constructed (Mann et al., 1983). When a recombinant plasmid containing a human cDNA, together with the retroviral LTR and ψ sequences is introduced into this cell line (by calcium phosphate precipitation for example), the ψ sequence allows the RNA transcript of the recombinant plasmid to be packaged into viral particles, which are then secreted into the culture media (Nicolas and Rubenstein, 1988; Temin, 1986; Mann et al., 1983). The media containing the recombinant retroviruses is then collected, optionally concentrated, and used for gene transfer. Retroviral vectors are able to infect a broad variety of cell types. However, integration requires the division of host cells (Paskind et al., 1975).

[0162] The retrovirus family includes the subfamilies of the oncoviruses, the lentiviruses and the spumaviruses. Two oncoviruses are Moloney murine leukemia virus (MMLV) and feline leukemia virus (FeLV). The lentiviruses include human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV) and feline immunodeficiency virus (FIV). Among the murine viruses such as MMLV there is a further classification. Murine viruses may be ecotropic, xenotropic, polytropic or amphotropic. Each class of viruses targets different cell surface receptors in order to initiate infection.

[0163] Further advances in retroviral vector design and concentration methods have allowed production of amphotropic and xenotropic viruses with titers of 10⁸ to 10⁹ cfu/ml (Bowles et al., 1996; Irwin et al., 1994; Jolly, 1994; Kitten et al., 1997).

[0164] Replication defective recombinant retroviruses are not acute pathogens in primates (Chowdhury et al., 1991). They have been successfully applied in cell culture systems to transfer the CFTR gene and generate cAMP-activated Cl⁻ secretion in a variety of cell types including human airway epithelia (Drumm et al., 1990; Olsen et al., 1992; Anderson et al., 1991; Olsen et al., 1993). While there is evidence of immune responses to the viral gag and env proteins, this does not prevent successful readministration of vector (McCormack et al., 1997). Further, since recombinant retroviruses have no expressed gene products other than the transgene, the risk of a host inflammatory response due to viral protein expression is limited (McCormack et al., 1997). As for the concern about insertional mutagenesis, to date there are no examples of insertional mutagenesis arising from any human trial with recombinant retroviral vectors.

[0165] More recently, hybrid lentivirus vectors have been described combining elements of human immunodeficiency virus (HIV) (Naldini et al., 1996) or feline immunodeficiency virus (FIV) (Poeschla et al., 1998) and MMLV. These vectors transduce nondividing cells in the CNS (Naldini et al., 1996; Blomer et al., 1997), liver (Kafri et al., 1997), muscle (Kafri et al., 1997) and retina (Miyoshi et al., 1997). However, a recent report in xenograft models of human airway epithelia suggests that in well-differentiated epithelia, gene transfer with VSV-G pseudotyped HIV-based lentivirus is inefficient (Goldman et al., 1997).

[0166] III. Adeno-Associated Virus

[0167] In addition, AAV possesses several unique features that make it more desirable than the other vectors. Unlike retroviruses, AAV can infect non-dividing cells; wild-type AAV has been characterized by integration, in a site-specific manner, into chromosome 19 of human cells (Kotin and Berns, 1989; Kotin et al., 1990; Kotin et al., 1991; Samulski et al., 1991); and AAV also possesses anti-oncogenic properties (Ostrove et al., 1981; Berns and Giraud, 1996). Recombinant AAV genomes are constructed by molecularly cloning DNA sequences of interest between the AAV ITRs, eliminating the entire coding sequences of the wild-type AAV genome. The AAV vectors thus produced lack any of the coding sequences of wild-type AAV, yet retain the property of stable chromosomal integration and expression of the recombinant genes upon transduction both in vitro and in vivo (Berns, 1990; Berns and Bohensky, 1987; Bertran et al., 1996; Kearns et al., 1996; Ponnazhagan et al., 1997a). Until recently, AAV was believed to infect almost all cell types, and even cross species barriers. However, it now has been determined that AAV infection is receptor-mediated (Ponnazhagan et al., 1996; Mizukami et al., 1996).

[0168] AAV utilizes a linear, single-stranded DNA of about 4700 bases. Inverted terminal repeats flank the genome. Two genes are present within the genome, giving rise to a number of distinct gene products. The first, the cap gene, produces three different virion proteins (VP), designated VP-1, VP-2 and VP-3. The second, the rep gene, encodes four non-structural proteins (NS). One or more of these rep gene products is responsible for transactivating AAV transcription. The sequence of AAV is provided by Srivastava et al. (1983), and in U.S. Pat. No. 5,252,479 (entire text of which is specifically incorporated herein by reference).

[0169] The three promoters in AAV are designated by their location, in map units, in the genome. These are, from left to right, p5, p19 and p40. Transcription gives rise to six transcripts, two initiated at each of three promoters, with one of each pair being spliced. The splice site, derived from map units 42-46, is the same for each transcript. The four non-structural proteins apparently are derived from the longer of the transcripts, and three virion proteins all arise from the smallest transcript.

[0170] AAV is not associated with any pathologic state in humans. Interestingly, for efficient replication, AAV requires “helping” functions from viruses such as herpes simplex virus I and II, cytomegalovirus, pseudorabies virus and, of course, adenovirus. The best characterized of the helpers is adenovirus, and many “early” functions for this virus have been shown to assist with AAV replication. Low level expression of AAV rep proteins is believed to hold AAV structural expression in check, and helper virus infection is thought to remove this block.

[0171] IV. Vaccinia Virus

[0172] Vaccinia viruses are a genus of the poxvirus family. Vaccinia virus vectors have been used extensively because of the ease of their construction, relatively high levels of expression obtained, wide host range and large capacity for carrying DNA. Vaccinia contains a linear, double-stranded DNA genome of about 186 kb that exhibits a marked “A-T” preference. Inverted terminal repeats of about 10.5 kb flank the genome. The majority of essential genes appear to map within the central region, which is most highly conserved among poxviruses. Estimated open reading frames in vaccinia virus number from 150 to 200. Although both strands are coding, extensive overlap of reading frames is not common. U.S. Pat. No. 5,656,465 (specifically incorporated by reference) describes in vivo gene delivery using pox viruses.

[0173] V. Papovavirus

[0174] The papovavirus family includes the papillomaviruses and the polyomaviruses. The polyomaviruses include Simian Virus 40 (SV40), polyoma virus and the human polyomaviruses BKV and JCV. Papillomaviruses include the bovine and human papillomaviruses. The genomes of polyomaviruses are circular DNAs of a little more than 5000 bases. The predominant gene products are three virion proteins (VP1-3) and Large T and Small T antigens. Some have an additional structural protein, the agnoprotein, and others have a Middle T antigen. Papillomaviruses are somewhat larger, approaching 8 kb. Little is known about the cellular receptors for polyomaviruses, but polyoma infection can be blocked by treating with sialidase. SV40 will still infect sialidase-treated cells, but JCV cannot hemagglutinate cells treated with sialidase. Because interaction of polyoma VP1 with the cell surface activates c-myc and c-fos, it has been hypothesized that the virus receptor may have some properties of a growth factor receptor. Papillomaviruses are specifically tropic for squamous epithelia, though the specific receptor has not been identified.

[0175] VI. Paramyxovirus

[0176] The paramyxovirus family is divided into three genera: paramyxovirus, morbillivirus and pneumovirus. The paramyxovirus genus includes the mumps virus and Sendai virus, among others, while the morbilliviruses include the measles virus and the pneumoviruses include respiratory syncytial virus (RSV). Paramyxovirus genomes are RNA based and contain a set of six or more genes, covalently linked in tandem. The genome is something over 15 kb in length. The viral particle is 150-250 nm in diameter, with “fuzzy” projections or spikes protruding therefrom. These are viral glycoproteins that help mediate attachment and entry of the virus into host cells.

[0177] A specialized series of proteins is involved in the binding and entry of paramyxoviruses. Attachment in Paramyxoviruses and Morbilliviruses is mediated by glycoproteins that bind to sialic acid-containing receptors. Other proteins anchor the virus by embedding hydrophobic regions in the lipid bilayer of the cell's surface, and exhibit hemagglutinating and neuraminidase activities. In Pneumoviruses, the glycoprotein is heavily glycosylated with O-glycosidic bonds. This molecule lacks the hemagglutinating and neuraminidase activities of its relatives.

[0178] VII. Herpesvirus

[0179] Because herpes simplex virus (HSV) is neurotropic, it has generated considerable interest in treating nervous system disorders. Moreover, the ability of HSV to establish latent infections in non-dividing neuronal cells without integrating into the host cell chromosome or otherwise altering the host cell's metabolism, along with the existence of a promoter that is active during latency, makes HSV an attractive vector. And though much attention has focused on the neurotropic applications of HSV, this vector also can be exploited for other tissues given its wide host range.

[0180] Another factor that makes HSV an attractive vector is the size and organization of the genome. Because HSV is large, incorporation of multiple genes or expression cassettes is less problematic than in other smaller viral systems. In addition, the availability of different viral control sequences with varying performance (temporal, strength, etc.) makes it possible to control expression to a greater extent than in other systems. It also is an advantage that the virus has relatively few spliced messages, further easing genetic manipulations.

[0181] HSV also is relatively easy to manipulate and can be grown to high titers. Thus, delivery is less of a problem, both in terms of volumes needed to attain sufficient MOI and in a lessened need for repeat dosings. For a review of HSV as a gene therapy vector, see Glorioso et al. (1995).

[0182] HSV, designated with subtypes 1 and 2, are enveloped viruses that are among the most common infectious agents encountered by humans, infecting millions of human subjects worldwide. The large, complex, double-stranded DNA genome encodes dozens of different gene products, some of which derive from spliced transcripts. In addition to virion and envelope structural components, the virus encodes numerous other proteins including a protease, a ribonucleotide reductase, a DNA polymerase, a ssDNA binding protein, a helicase/primase, a DNA dependent ATPase, a dUTPase and others.

[0183] HSV genes form several groups whose expression is coordinately regulated and sequentially ordered in a cascade fashion (Honess and Roizman, 1974; Honess and Roizman, 1975; Roizman and Sears, 1995). The expression of α genes, the first set of genes to be expressed after infection, is enhanced by the virion protein number 16, or α-trans-inducing factor (Post et al., 1981; Batterson and Roizman, 1983; Campbell et al., 1983). The expression of β genes requires functional α gene products, most notably ICP4, which is encoded by the α4 gene (DeLuca et al., 1985). γ genes, a heterogeneous group of genes encoding largely virion structural proteins, require the onset of viral DNA synthesis for optimal expression (Holland et al., 1980).

[0184] In line with the complexity of the genome, the life cycle of HSV is quite involved. In addition to the lytic cycle, which results in synthesis of virus particles and, eventually, cell death, the virus has the capability to enter a latent state in which the genome is maintained in neural ganglia until some as yet undefined signal triggers a recurrence of the lytic cycle. Avirulent variants of HSV have been developed and are readily available for use in gene therapy contexts (U.S. Pat. No. 5,672,344).

G. EXAMPLES

[0185] The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.

Example 1 Combinatoric Gene Assembly

[0186] The inventor has developed a strategy of oligomer assembly into larger DNA molecules denoted combinatoric assembly. The procedure is carried out as follows: one may design a plasmid using one of a number of commercial or public domain computer programs to contain the genes, promoters, drug selection, origin of replication, etc. required. SynGene v.2.0 is a program that generates a list of overlapping oligonucleotides sufficient to reassemble the gene or plasmid (see FIG. 7). For instance, for a 5000 bp gene, SynGene 2.0 can generate two lists, one of 100 component 50-mers from one strand and one of 100 component 50-mers from the complementary strand, such that each pair of oligomers will overlap by 25 base pairs. The program checks the sequence for repeats and produces a MERMADE input file which directly programs the oligonucleotide synthesizer. The synthesizer produces two sets of 96-well plates containing the complementary oligonucleotides. A SynGene program is depicted in FIG. 7. This program is designed to break down a designer gene or genome into oligonucleotides for synthesis. The program is for the complete synthetic designer gene and is based upon an original program for formatting DNA sequences written by Dr. Glen Evans.
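
The decomposition step described above can be illustrated with a short sketch. The Python below is not the SynGene program, whose logic is shown only in FIG. 7; it is a minimal illustration that assumes a circular target (such as a plasmid) whose length is a multiple of the oligomer length. The names design_oligos and reverse_complement are hypothetical. One strand is tiled into consecutive 50-mers and the complementary strand into 50-mers shifted by 25 bases, so that each minus-strand oligomer bridges two neighboring plus-strand oligomers.

```python
# Illustrative sketch only; not the SynGene program.
# Assumption: the target is circular and its length is a multiple of oligo_len.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligos(seq, oligo_len=50, offset=25):
    """Tile a circular sequence into plus- and minus-strand oligomers.

    Plus-strand oligomers are consecutive windows of the sequence.
    Minus-strand oligomers are reverse complements of windows shifted
    by `offset` bases, so each one overlaps two plus-strand neighbors.
    """
    if len(seq) % oligo_len != 0:
        raise ValueError("sequence length must be a multiple of oligo_len")
    n = len(seq)
    doubled = seq + seq  # lets the last window wrap around the circle
    starts = range(0, n, oligo_len)
    plus = [doubled[s:s + oligo_len] for s in starts]
    minus = [
        reverse_complement(doubled[s + offset:s + offset + oligo_len])
        for s in starts
    ]
    return plus, minus
```

For a 5000 base sequence this yields 100 oligomers per strand, which agrees with the counts given in the example above.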

[0187] Combinatoric assembly is best carried out using a programmable robotic workstation such as a Beckman Biomek 2000. In short, pairs of oligomers which overlap are mixed and annealed. Following annealing, a smaller set of duplex oligomers is generated. These are again paired and annealed, forming a smaller set of larger oligomers. Sequentially, overlapping oligomers are allowed to anneal until the entire reassembly is completed. Annealing may be carried out in the absence of ligase, or each step may be followed by ligation. In one configuration, oligomers are annealed in the presence of topoisomerase 2, which does not require 5′ phosphorylation of the oligomer, occurs at room temperature, and is a rapid (5 minute) reaction as opposed to a 12 h ligation at 12° C. Following the complete assembly, the resulting DNA molecule can be used for its designed purpose, usually transformation into a bacterial host for replication. The steps in this cycle are outlined in FIG. 3.
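
The pairing scheme can be sketched in a few lines of Python. This is an illustration only, not the workstation program: pairwise_rounds is a hypothetical name, fragments are represented as plain strings, and joining two neighbors is modeled as simple concatenation (the overlaps and the ligation chemistry are not simulated). It is also assumed that a fragment left unpaired in a round with an odd count is carried forward unchanged.

```python
# Illustrative sketch only; joining is modeled as string concatenation.
def pairwise_rounds(fragments):
    """Return the fragments present after each round of pairwise joining."""
    rounds = [list(fragments)]
    while len(rounds[-1]) > 1:
        current = rounds[-1]
        merged = []
        for i in range(0, len(current), 2):
            pair = current[i:i + 2]  # two neighbors, or one leftover fragment
            merged.append("".join(pair))
        rounds.append(merged)
    return rounds


history = pairwise_rounds(["AA", "CC", "GG", "TT", "AC"])
for number, products in enumerate(history):
    print(number, products)
```

Under these assumptions a set of 96 duplexes, one plate's worth, is reduced to a single product in seven rounds.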

[0188] This approach has a major advantage over traditional recombinant DNA based cloning. While it is technically feasible to make virtually any modification or mutation in existing DNA molecules, the effort required, as well as the high technical skill, make some constructions difficult or tedious. Traditional cloning, while having been used for many years, is not applicable to automated gene cloning or large scale creation of entirely novel DNA sequences.

Example 2 Production of Artificial Genes

[0189] In one example, the present invention will produce a known gene of about 1000 base pairs in length by the following method. A set of oligonucleotides, each of 50 bases, is generated such that the entire plus strand of the gene is represented. A second set of oligonucleotides, also comprised of 50-mers, is generated for the minus strand. This set is designed, however, such that complementary pairing of the first and second sets results in overlap of “paired” sequences, i.e., each oligonucleotide of the first set is complementary with regions from two oligonucleotides of the second set (with the possible exception of the terminal oligonucleotides). The region of overlap is set at 30 bases, leaving a 20 base overhang for each pair. The first and second sets of oligonucleotides are annealed in a single mixture and treated with a ligating enzyme.
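
The overlap geometry of this example can be checked with a small calculation. The sketch below is hypothetical (the function name annealing_partners is not from the specification) and assumes a linear 1000 base gene in which the minus-strand windows are shifted 20 bases downstream of the plus-strand windows, with the last minus-strand window truncated at the end of the gene. It lists, in plus-strand coordinates, which oligonucleotides share bases and how many.

```python
# Illustrative sketch only. Assumptions: linear gene, minus-strand windows
# shifted downstream by (oligo_len - overlap), last window truncated.
def annealing_partners(gene_len=1000, oligo_len=50, overlap=30):
    """Return (plus_index, minus_index, shared_bases) for every partner pair."""
    shift = oligo_len - overlap
    starts = range(0, gene_len, oligo_len)
    plus = [(s, s + oligo_len) for s in starts]
    minus = [(s + shift, min(s + shift + oligo_len, gene_len)) for s in starts]
    pairs = []
    for i, (p_start, p_end) in enumerate(plus):
        for j, (m_start, m_end) in enumerate(minus):
            shared = min(p_end, m_end) - max(p_start, m_start)
            if shared > 0:
                pairs.append((i, j, shared))
    return pairs


for pair in annealing_partners()[:5]:
    print(pair)
```

Under these assumptions every plus-strand oligonucleotide except the first has two minus-strand partners, sharing 30 bases with one and 20 bases with the other, which is the pattern the paragraph describes.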

[0190] In another example, the gene to be synthesized is about 5000 base pairs. Each set of oligonucleotides is made up of fifty 100-mers with overlapping regions, of complementary oligonucleotides, of 75 bases, leaving 25 base “sticky ends.” In this embodiment, the 5′ terminal oligonucleotide of the first oligonucleotide set is annealed with the 3′ terminal oligonucleotide of the second set to form a first annealed product, then the next most 5′ terminal oligonucleotide of the first set is annealed with the first annealed product to form a second annealed product, and the process is repeated until all oligonucleotides of said first and said second sets have been annealed. Ligation of the products may occur between steps or at the conclusion of all hybridizations.

[0191] In a third example, a gene of 100,000 bp is synthesized from one thousand 100-mers. Again, the overlap between “pairs” of plus and minus oligonucleotides is 75 bases, leaving a 25 base overhang. In this method, a combinatorial approach is used where corresponding pairs of partially complementary oligonucleotides are hybridized in a first step. A second round of hybridization then is undertaken with appropriately complementary pairs of products from the first round. This process is repeated a total of 10 times, each round of hybridization reducing the number of products by half. Ligation of the products then is performed.
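
The figure of 10 rounds can be checked with simple arithmetic. The sketch below assumes that the first hybridization step leaves 1000 duplex products (that is, one thousand 100-mers per strand) and that an odd product left over in any round is carried forward to the next; the function name halving_rounds is hypothetical.

```python
# Illustrative arithmetic only. Assumption: 1000 duplex products exist after
# the first pairing step, and a leftover product is carried forward.
def halving_rounds(n_products):
    """Return (number of rounds, product counts) when each round halves the set."""
    counts = [n_products]
    while counts[-1] > 1:
        counts.append((counts[-1] + 1) // 2)  # halve, rounding up
    return len(counts) - 1, counts


rounds, counts = halving_rounds(1000)
print(rounds, counts)
```

Under these assumptions the product count falls from 1000 to 1 in 10 rounds, consistent with the number stated above.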

Example 3 Large Scale Expression of Human Gene Products

[0192] Once the human genome has been characterized, functional analysis of the human genome, based upon the complete sequence, will require a variety of approaches to structural, functional and network biology. The approach proposed herein for producing a series of expression constructs representing all potential human gene products and the assembly of sets of bacteria and/or yeast expressing these products will provide an important avenue into the beginnings of functional analysis.

[0193] Secondly, the approach described here, when developed to its theoretical optima, will allow the large scale transfer of genes to cell lines or organisms for functional analysis. The long term goal of this concept is the creation of living organisms entirely based on bioinformatics and information processing. Obviously, the knowledge of the complete sequence is not sufficient to appreciate the myriad of biological concepts inherent in life.

Example 4 Construction of a Synthetic Plasmid

[0194] A DNA molecule was designed using synthetic parts of previously known plasmids. As a demonstration of this technique, plasmid synlux4 was designed. Synlux4 consists of 4800 base pairs of DNA. Within this sequence are included the sequence of lux A and lux B, the A and B components of the luciferase protein from Vibrio fischeri, portions of plasmid pUC19 including the origin of replication and replication stability sequences, and the promoter and coding sequence for tn9 kanamycin/neomycin phosphotransferase. The sequence was designed on a computer using Microsoft Word and Vector NTI (InforMax, Inc.). The sequence is listed in FIG. 4.

[0195] Following design, a computer program SynGene 2.0 was used to break the sequence down into components consisting of overlapping 50-mer oligonucleotides. From the 4800 base pair sequence, 192 50-mers were designed. The component oligonucleotides are listed in FIG. 5. These component oligonucleotides were synthesized using a custom 96-well oligonucleotide synthesizer (Rayner et al., Genome Research, 8:741-747, 1998). The component oligonucleotides were produced in two 96-well microtitre plates, each plate holding one set of component oligonucleotides. Thus, plate 1 held the forward strand oligos and plate 2 held the reverse strand oligos.
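
The plate arithmetic can be illustrated as follows: 4800 bases divided into 50-mers gives 96 oligonucleotides per strand, which fills one 96-well plate per strand and totals 192. The sketch below is hypothetical; the specification does not state how oligonucleotides were ordered on the plates, so the row-by-row layout (A1 to A12, then B1, and so on) and the names well_name and plate_map are assumptions made for illustration.

```python
# Illustrative sketch only. Assumption: wells are filled row by row
# (A1..A12, B1..B12, ...); the actual plate layout is not given in the text.
ROWS = "ABCDEFGH"  # 8 rows
COLUMNS = 12       # 12 columns, 96 wells in total


def well_name(index):
    """Map a zero-based oligonucleotide index to a 96-well plate position."""
    if not 0 <= index < len(ROWS) * COLUMNS:
        raise ValueError("index must be between 0 and 95")
    row, column = divmod(index, COLUMNS)
    return f"{ROWS[row]}{column + 1}"


def plate_map(sequence_length=4800, oligo_length=50):
    """Assign one plate per strand, one well per oligonucleotide."""
    per_strand = sequence_length // oligo_length
    wells = [well_name(i) for i in range(per_strand)]
    return {
        "plate 1 (forward strand)": wells,
        "plate 2 (reverse strand)": list(wells),
    }


layout = plate_map()
print(sum(len(wells) for wells in layout.values()))
```

With this layout, the forward and reverse oligonucleotides that share an index occupy the same well position on their respective plates, which is one convenient arrangement for robotic transfers.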

[0196] The oligonucleotides were assembled and ligations carried out using a Biomek 1000 robotic workstation (Beckman). Sequential transfers of oligonucleotides were done by pipetting from one well to a second well of the plate and a ligation reaction carried out using T4 ligase. The pattern of assembly is delineated in FIG. 6.

[0197] Following assembly, the resulting ligation mix was used to transform competent E. coli strain DH5α. The transformation mix was plated on LB plates containing 25 μg/ml kanamycin sulfate and recombinant colonies were obtained. The resulting recombinant clones were isolated, cloned, and DNA prepared. The DNA was analyzed on 1% agarose gels in order to detect recombinant molecules. Clones were shown to contain the expected 4800 base pair plasmid containing lux A and B genes.

[0198] All of the compositions and/or methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the compositions and/or methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.

REFERENCES

[0199] The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.

[0200] Anderson et al. U.S. Pat. No. 5,399,346, 1995.

[0201] Anderson et al., Science, 253:202-205, 1991. Baichwal and Sugden, “Vectors for gene transfer derived from animal DNA viruses: Transient and stable expression of transferred genes,” In: Gene Transfer, Kucherlapati (ed.), New York, Plenum Press, pp. 117-148, 1986.

[0202] Batterson and Roizman, J. Virol., 46:371-377, 1983.

[0203] Beaucage and Iyer, Tetrahedron, 48:2223-2311, 1992.

[0204] Benvenisty and Reshef, “Direct introduction of genes into rats and expression of the genes,” Proc. Nat. Acad. Sci. USA, 83:9551-9555, 1986.

[0205] Berns and Bohenzky, Adv. Virus Res., 32:243-307, 1987.

[0206] Bertran, et al., J Virol., 70(10):6759-6766, 1996.

[0207] Bittner et al., Methods in Enzymol, 153:516-544, 1987.

[0208] Blomer et al., “Highly efficient and sustained gene transfer in adult neurons with a lentivirus vector,” J. Virol., 71:6641-6649, 1997.

[0209] Bowles et al., Hum. Gene Ther., 7:1735-1742, 1996.

[0210] Chang et al., “Foreign gene delivery and expression in hepatocytes using a hepatitis B virus vector,” Hepatology, 14:124A, 1991.

[0211] Chen and Okayama, “High-efficiency transfection of mammalian cells by plasmid DNA,” Mol. Cell Biol., 7:2745-2752, 1987.

[0212] Coffin, “Retroviridae and their replication,” In: Virology, Fields, Knipe (ed.), New York: Raven Press, pp. 1437-1500, 1990.

[0213] Coffin, In: Virology, ed., New York: Raven Press, pp. 1437-1500, 1990.

[0214] Colberre-Garapin et al., J. Mol. Biol., 150:1, 1981.

[0215] Couch et al., “Immunization with types 4 and 7 adenovirus by selective infection of the intestinal tract,” Am. Rev. Resp. Dis., 88:394-403, 1963.

[0216] Coupar et al., “A general method for the construction of recombinant vaccinia virus expressing multiple foreign genes,” Gene, 68:1-10, 1988.

[0217] Davey et al., EPO No. 329 822.

[0218] DeLuca et al., J. Virol., 56:558-570, 1985.

[0219] Drumm et al., Cell, 62:1227-1233, 1990.

[0220] Dubensky et al., “Direct transfection of viral and plasmid DNA into the liver or spleen of mice,” Proc. Nat. Acad. Sci. USA, 81:7529-7533, 1984.

[0221] Fechheimer et al., “Transfection of mammalian cells with plasmid DNA by scrape loading and sonication loading,” Proc. Natl. Acad. Sci. USA, 84:8463-8467, 1987.

[0222] Ferkol et al., “Regulation of the phosphoenolpyruvate carboxykinase/human factor IX gene introduced into the livers of adult rats by receptor-mediated gene transfer,” FASEB J., 7:1081-1091, 1993.

[0223] Fraley et al., “Entrapment of a bacterial plasmid in phospholipid vesicles: Potential for gene transfer,” Proc. Natl. Acad. Sci. USA, 76:3348-3352, 1979.

[0224] Freifelder, Physical Biochemistry Applications to Biochemistry and Molecular Biology, 2nd ed. Wm. Freeman and Co., New York, N.Y., 1982.

[0225] Friedmann, “Progress toward human gene therapy,” Science, 244:1275-1281, 1989.

[0226] Frohman, In: PCR™ Protocols: A Guide To Methods And Applications, Academic Press, N.Y., 1990.

[0227] Ghosh-Choudhury et al., EMBO J., 6:1733-1739, 1987.

[0228] Ghosh and Bachhawat, “Targeting of liposomes to hepatocytes,” In: Liver diseases, targeted diagnosis and therapy using specific receptors and ligands, Wu, Wu (ed.), New York: Marcel Dekker, pp. 87-104, 1991.

[0229] Gingeras et al., PCT Application WO 88/10315.

[0230] Glorioso et al., Ann. Rev. Microbiol, 49:675-710, 1995.

[0231] Goldman et al., Hum Gene Ther. 8(18): 2261-2268, 1997.

[0232] Gomez-Foix et al., “Adenovirus-mediated transfer of the muscle glycogen phosphorylase gene into hepatocytes confers altered regulation of glycogen,” J. Biol. Chem., 267:25129-25134, 1992.

[0233] Gopal, “Gene transfer method for transient gene expression, stable transfection, and cotransfection of suspension cell cultures,” Mol. Cell Biol., 5:1188-1190, 1985.

[0234] Graham and Prevec, “Adenovirus-based expression vectors and recombinant vaccines,” Biotech., 20:363-390, 1992.

[0235] Graham and Prevec, “Manipulation of adenovirus vector,” In: Methods in Molecular Biology: Gene Transfer and Expression Protocol, Clifton and Murray (ed.), NJ, Humana Press, 7:109-128, 1991.

[0236] Graham and Van Der Eb, “A new technique for the assay of infectivity of human adenovirus 5 DNA,” Virology, 52:456-467, 1973.

[0237] Graham et al., “Characteristics of a human cell line transformed by DNA from human adenovirus type 5,” J. Gen. Virol., 36:59-72, 1977.

[0238] Grunhaus and Horwitz, “Adenovirus as cloning vector,” Seminar in Virology, 3:237-252, 1992.

[0239] Harland and Weintraub, “Translation of mammalian mRNA injected into Xenopus oocytes is specifically inhibited by antisense RNA,” J. Cell Biol., 101:1094-1099, 1985.

[0240] Hermonat and Muzyczka, “Use of adenoassociated virus as a mammalian DNA cloning vector: Transduction of neomycin resistance into mammalian tissue culture cells,” Proc. Nat. Acad. Sci. USA, 81:6466-6470, 1984.

[0241] Hesdorffer et al., “Efficient gene transfer in live mice using a unique retroviral packaging line,” DNA Cell Biol., 9:713-723, 1990.

[0242] Herz and Gerard, “Adenovirus-mediated transfer of low density lipoprotein receptor gene acutely accelerates cholesterol clearance in normal mice,” Proc. Natl. Acad. Sci. USA, 90:2812-2816, 1993.

[0243] Hess et al., J. Adv. Enzyme Reg., 7:149, 1968.

[0244] Hitzeman et al., J. Biol. Chem., 255:2073, 1980.

[0245] Holland et al., Biochemistry, 17:4900, 1978.

[0246] Holland et al., Virology, 101:10-18, 1980.

[0247] Honess and Roizman, J. Virol., 14:8-19, 1974.

[0248] Honess and Roizman, J. Virol., 16:1308-1326, 1975.

[0249] Horwich et al., “Synthesis of hepadnavirus particles that contain replication-defective duck hepatitis B virus genomes in cultured HuH7 cells,” J. Virol., 64:642-650, 1990.

[0250] Innis et al., PCR™ Protocols, Academic Press, Inc., San Diego, Calif., 1990.

[0251] Inouye et al., Nucleic Acids Res., 13: 3101-3109, 1985.

[0252] Irwin et al., J. Virol., 68:5036-5044, 1994.

[0253] Johnson et al., “Peptide Turn Mimetics,” In: Biotechnology And Pharmacy, Pezzuto et al. (eds.), Chapman and Hall, New York, 1993.

[0254] Jolly, “Viral vector systems for gene therapy,” Cancer Gene Ther., 1:51-64, 1994.

[0255] Jones and Shenk, “Isolation of deletion and substitution mutants of adenovirus type 5,” Cell, 13:181-188, 1978.

[0256] Jones, Genetics, 85: 12, 1977.

[0257] Kafri et al., “Sustained expression of genes delivered directly into liver and muscle by lentiviral vectors,” Nat. Genet., 17:314-317, 1997.

[0258] Kaneda et al., “Increased expression of DNA cointroduced with nuclear protein in adult rat liver,” Science, 243:375-378, 1989.

[0259] Karlsson et al., EMBO J., 5:2377-2385, 1986.

[0260] Kato et al., “Expression of hepatitis B virus surface antigen in adult rat liver,” J. Biol. Chem., 266:3361-3364, 1991.

[0261] Kearns et al., Gene Ther., 3:748-755, 1996.

[0262] Kingsman et al., Gene, 7: 141, 1979.

[0263] Kitten et al. Hum. Gene Ther., 8:1491-1494, 1997.

[0264] Klein et al., “High-velocity microprojectiles for delivering nucleic acids into living cells,” Nature, 327:70-73, 1987.

[0265] Kotin and Berns, Virol., 170:460-467, 1989.

[0266] Kotin et al., Genomics, 10:831-834, 1991.

[0267] Kotin et al., Proc. Natl. Acad. Sci. USA, 87:2211-2215, 1990.

[0268] Kwoh et al., Proc. Nat. Acad. Sci. USA, 86:1173, 1989.

[0269] Kyte and Doolittle, “A simple method for displaying the hydropathic character of a protein,” J. Mol. Biol., 157(1):105-132, 1982.

[0270] Le Gal La Salle et al., “An adenovirus vector for gene transfer into neurons and glia in the brain,” Science, 259:988-990, 1993.

[0271] Levrero et al., Gene, 101:195-202, 1991.

[0272] Lishanski et al., Proc. Nat'l. Acad. Sci. USA, 91:2674-2678, 1994. Lowy et al., Cell, 22:817, 1980.

[0273] Macejak and Sarnow, Nature, 353:90-94, 1991.

[0274] Mann et al., “Construction of a retrovirus packaging mutant and its use to produce helper-free defective retrovirus,” Cell, 33:153-159, 1983.

[0275] Mann et al., Cell, 33:153-159, 1983.

[0276] Markowitz et al., “A safe packaging line for gene transfer: Separating viral genes on two different plasmids,” J. Virol., 62:1120-1124, 1988.

[0277] McCormack et al. Hum. Gene Ther. 8: 1263-1273, 1997.

[0278] Miller et al., PCT Application WO 89/06700.

[0279] Miyoshi et al., Proc. Natl. Acad. Sci. USA, 94:10319-10323, 1997.

[0280] Mizukami et al., Virology, 217:124-130, 1996.

[0281] Mulligan et al., Proc. Nat'l Acad. Sci. USA, 78:2072, 1981.

[0282] Mulligan, “The basic science of gene therapy,” Science, 260:926-932, 1993.

[0283] Myers, EP 0273085

[0284] Naldini et al., Science, 272:263-267, 1996.

[0285] Nicolas and Rubenstein, “Retroviral vectors,” In: Vectors: A survey of molecular cloning vectors and their uses, Rodriguez and Denhardt (eds.), Stoneham: Butterworth, pp. 493-513, 1988.

[0286] Nicolas and Rubenstein, In: Vectors: A survey of molecular cloning vectors and their uses, Rodriguez and Denhardt (eds.), Stoneham: Butterworth, pp. 493-513, 1988.

[0287] Nicolau and Sene, “Liposome-mediated DNA transfer in eukaryotic cells,” Biochim. Biophys. Acta, 721:185-190, 1982.

[0288] Nicolau et al., “Liposomes as carriers for in vivo gene transfer and expression,” Methods Enzymol., 149:157-176, 1987.

[0289] O'Hare et al., Proc. Nat'l Acad. Sci. USA, 78: 1527, 1981.

[0290] Ohara et al., Proc. Nat'l Acad. Sci. USA, 86: 5673-5677, 1989.

[0291] Olsen, J. C., L. G. Johnson, J. M. Stutts, B. Sarkadi, J. R. Yankaskas, R. Swanstrom, and R. C. Boucher. 1992. Correction of the apical membrane chloride permeability defect in polarized cystic fibrosis airway epithelia following retroviral-mediated gene transfer. Hum. Gene Ther. 3:253-266.

[0292] Olsen, Johnson, Wong-Sun, Moore, Swanstrom, and Boucher, “Retrovirus-mediated gene transfer to cystic fibrosis airway epithelial cells: effect of selectable marker sequences on long-term expression,” Nucleic Acids Res., 21(3):663-669, 1993.

[0293] Ostrove et al., Virology, 113:532-533, 1981.

[0294] Paskind et al., “Dependence of Moloney murine leukemia virus production on cell growth,” Virology, 67:242-248, 1975.

[0295] Paskind et al., Virology, 67:242-248, 1975.

[0296] Pelletier and Sonenberg, Nature, 334:320-325, 1988.

[0297] Perales et al., “Gene transfer in vivo: Sustained expression and regulation of genes introduced into the liver by receptor-targeted uptake,” Proc. Natl. Acad. Sci. USA, 91:4086-4090, 1994.

[0298] Pignon et al., Hum. Mutat., 3:126-132, 1994.

[0299] Poeschla, E. M., F. Wong-Staal, and D. L. Looney. 1998. Efficient transduction of non-dividing human cells by feline immunodeficiency virus lentiviral vectors. Nature Med. 4:354-357.

[0300] Ponnazhagan et al., Hum. Gene Ther., 8:275-284, 1997a.

[0301] Ponnazhagan et al., J. Gen. Virol., 77:1111-1122, 1996.

[0302] Post et al., Cell, 24:555-565, 1981.

[0303] Potter et al., “Enhancer-dependent expression of human κ immunoglobulin genes introduced into mouse pre-B lymphocytes by electroporation,” Proc. Nat. Acad. Sci. USA, 81:7161-7165, 1984.

[0304] Racher et al., Biotechnology Techniques, 9:169-174, 1995.

[0305] Ragot et al., “Efficient adenovirus-mediated transfer of a human minidystrophin gene to skeletal muscle of mdx mice,” Nature, 361:647-650, 1993.

[0306] Renan, “Cancer genes: Current status, future prospects and applications in radiotherapy/oncology,” Radiother. Oncol., 19:197-218, 1990.

[0307] Renan, Radiother. Oncol., 19:197-218, 1990.

[0308] Rich et al., “Development and analysis of recombinant adenoviruses for gene therapy of cystic fibrosis,” Hum. Gene Ther., 4:461-476, 1993.

[0309] Ridgeway, “Mammalian expression vectors,” In: Rodriguez R L, Denhardt D T, ed. Vectors: A survey of molecular cloning vectors and their uses. Stoneham: Butterworth, pp. 467-492, 1988.

[0310] Rippe et al., “DNA-mediated gene transfer into adult rat hepatocytes in primary culture,” Mol. Cell Biol., 10:689-695, 1990.

[0311] Roizman and Sears, In Fields' Virology, 3rd Edition, eds. Fields et al. (Raven Press, New York, N.Y.), pp. 2231-2295, 1995.

[0312] Rosenfeld et al., “In vivo transfer of the human cystic fibrosis transmembrane conductance regulator gene to the airway epithelium,” Cell, 68:143-155, 1992.

[0313] Roux et al., “A versatile and potentially general approach to the targeting of specific cell types by retroviruses: Application to the infection of human cells by means of major histocompatibility complex class I and class II antigens by mouse ecotropic murine leukemia virus-derived viruses,” Proc. Natl. Acad. Sci. USA, 86:9079-9083, 1989.

[0314] Sambrook et al., Molecular cloning: A laboratory manual, 2d Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.

[0315] Samulski et al., EMBO J., 10:3941-3950, 1991.

[0316] Srivastava et al., J. Virol., 45:555-564, 1983.

[0317] Stinchcomb et al., Nature, 282: 39, 1979.

[0318] Stratford-Perricaudet and Perricaudet, “Gene transfer into animals: the promise of adenovirus,” p. 51-61, In: HUMAN GENE TRANSFER, Cohen-Haguenauer and Boiron (eds.), Editions John Libbey Eurotext, France, 1991.

[0319] Stratford-Perricaudet et al., “Evaluation of the transfer and expression in mice of an enzyme-encoding gene using a human adenovirus vector,” Hum. Gene Ther., 1:241-256, 1990.

[0320] Summers et al., “A manual of methods for baculovirus vectors and insect cell culture procedures,” Texas Agriculture Experimental Station.

[0321] Szybalska et al., Proc. Nat'l Acad. Sci. USA, 48:2026, 1962.

[0322] Temin, “Retrovirus vectors for gene transfer: Efficient integration into and expression of exogenous DNA in vertebrate cell genome,” In: Gene Transfer, Kucherlapati (ed.), New York, Plenum Press, pp. 149-188, 1986.

[0323] Temin, In: Gene Transfer, Kucherlapati (ed.), New York: Plenum Press, pp. 149-188, 1986.

[0324] Top et al., “Immunization with live types 7 and 4 adenovirus vaccines. II. Antibody response and protective effect against acute respiratory disease due to adenovirus type 7,” J. Infect. Dis., 124:155-160, 1971.

[0325] Tschemper et al., Gene, 10: 157, 1980.

[0326] Tur-Kaspa et al., “Use of electroporation to introduce biologically active foreign genes into primary rat hepatocytes,” Mol. Cell Biol., 6:716-718, 1986.

[0327] U.S. Pat. No. 4,683,195

[0328] U.S. Pat. No. 4,683,202

[0329] U.S. Pat. No. 4,800,159

[0330] Varmus et al., “Retroviruses as mutagens: Insertion and excision of a nontransforming provirus alter the expression of a resident transforming provirus,” Cell, 25:23-36, 1981.

[0331] Wagner et al., Proc. Natl. Acad. Sci. USA, 87(9):3410-3414, 1990.

[0332] Wagner et al., Science, 260:1510-1513, 1993.

[0333] Walker et al., Proc. Natl. Acad. Sci. USA, 89:392-396, 1992.

[0334] Wigler et al., Cell, 11:223, 1977.

[0335] Wigler et al., Proc. Natl. Acad. Sci. USA, 77:3567, 1980.

[0336] Wong et al., “Appearance of β-lactamase activity in animal cells upon liposome mediated gene transfer,” Gene, 10:87-94, 1980.

[0337] Wu and Wu, “Evidence for targeted gene delivery to HepG2 hepatoma cells in vitro,” Biochem., 27:887-892, 1988.

[0338] Wu and Wu, “Receptor-mediated in vitro gene transfections by a soluble DNA carrier system,” J. Biol. Chem., 262:4429-4432, 1987.

[0339] Wu and Wu, Adv. Drug Delivery Rev., 12:159-167, 1993.

[0340] Wu et al., Genomics, 4:560, 1989.

[0341] Yang et al., “In vivo and in vitro gene transfer to mammalian somatic cells by particle bombardment,” Proc. Natl. Acad. Sci. USA, 87:9568-9572, 1990.

[0342] Zelenin et al., “High-velocity mechanical DNA transfer of the chloramphenicol acetyltransferase gene into rodent liver, kidney and mammary gland cells in organ explants and in vivo,” FEBS Lett., 280:94-96, 1991.

1 193 1 4800 DNA Artificial Sequence Synthetic plasmid 1 aagcttacctcgatttgagg acgttacaag tattactgtt aaggagcgta gattaaaaaa 60 tgaaattgaaaatgaattat tagaattggc ttaaataaac agaatcacca aaaaggaata 120 gagtatgaagtttggaaata tttgtttttc gtatcaacca ccaggtgaaa ctcataagct 180 aagtaatggatcgctttgtt cggcttggta tcgcctcaga agagtagggt ttgatacata 240 ttggaccttagaacatcatt ttacagagtt tggtcttacg ggaaatttat ttgttgctgc 300 ggctaacctgttaggaagaa ctaaaacatt aaatgttggc actatggggg ttgttattcc 360 gacagcacacccagttcgac agttagaaga cgttttatta ttagatcaaa tgtcgaaagg 420 tcgttttaattttggaaccg ttcgagggct ataccataaa gattttcgag tatttggtgt 480 tgatatggaagagtctcgag caattactca aaatttctac cagatgataa tggaaagctt 540 acagacaggaaccattagct ctgatagtga ttacattcaa tttcctaagg ttgatgtata 600 tcccaaagtgtactcaaaaa atgtaccaac ctgtatgact gctgagtccg caagtacgac 660 agaatggctagcaatacaag ggctaccaat ggttcttagt tggattattg gtactaatga 720 aaaaaaagcacagatggaac tctataatga aattgcgaca gaatatggtc atgatatatc 780 taaaatagatcattgtatga cttatatttg ttctgttgat gatgatgcac aaaaggcgca 840 agatgtttgtcgggagtttc tgaaaaattg gtatgactca tatgtaaatg cgaccaatat 900 ctttaatgatagcaatcaaa ctcgtggtta tgattatcat aaaggtcaat ggcgtgattt 960 tgttttacaaggacatacaa acaccaatcg acgtgttgat tatagcaatg gtattaaccc 1020 tgtaggcactcctgagcagt gtattgaaat cattcaacgt gatattgatg caacgggtat 1080 tacaaacattacatgcggat ttgaagctaa tggaactgaa gatgaaataa ttgcttccat 1140 gcgacgctttatgacacaag tcgctccttt cttaaaagaa cctaaataaa ttacttattt 1200 gatactagagataataagga acaagttatg aaatttggat tattttttct aaactttcag 1260 aaagatggaataacatctga agaaacgttg gataatatgg taaagactgt cacgttaatt 1320 gattcaactaaatatcattt taatactgcc tttgttaatg aacatcactt ttcaaaaaat 1380 ggtattgttggagcacctat taccgcagct ggttttttat tagggttaac aaataaatta 1440 catattggttcattaaatca agtaattacc acccatcacc ctgtacgtgt agcagaagaa 1500 gccagtttattagatcaaat gtcagaggga cgcttcattc ttggttttag tgactgcgaa 1560 agtgatttcgaaatggaatt ttttagacgt catatctcat caaggcaaca acaatttgaa 1620 gcatgctatgaaataattaa tgacgcatta actacaggtt attgtcatcc ccaaaacgac 1680 
ttttatgattttccaaaggt ttcaattaat ccacactgtt acagtgagaa tggacctaag 1740 caatatgtatccgctacatc aaaagaagtc gtcatgtggg cagcgaaaaa ggcactgcct 1800 ttaacatttaagtgggagga taatttagaa accaaagaac gctatgcaat tctatataat 1860 aaaacagcacaacaatatgg tattgatatt tcggatgttg atcatcaatt aactgtaatt 1920 gcgaacttaaatgctgatag aagtacggct caagaagaag tgagagaata cttaaaagac 1980 tatatcactgaaacttaccc tcaaatggac agagatgaaa aaattaactg cattattgaa 2040 gagaatgcagttgggtctca tgatgactat tatgaatcga caaaattagc agtggaaaaa 2100 acagggtctaaaaatatttt attatccttt gaatcaatgt ccgatattaa agatgtaaaa 2160 gatattattgatatgttgaa ccaaaaaatc gaaatgaatt taccataata aaattaaagg 2220 caatttctatattagattgc ctttttgggg atcctctaga aatattttat ctgattaata 2280 agatgagaattcactggccg tcgttttaca acgtcgtgac tgggaaaacc ctggcgttac 2340 ccaacttaatcgccttgcag cacatccccc tttcgccagc tggcgtaata gcgaagaggc 2400 ccgcaccgatcgcccttccc aacagttgcg cagcctgaat ggcgaatggc gcctgatgcg 2460 gtattttctccttacgcatc tgtgcggtat ttcacaccgc atatggtgca ctctcagtac 2520 aatctgctctgatgccgcat agttaagcca gccccgacac ccgccaacac ccgctgacgc 2580 gccctgacgggcttgtctgc tcccggcatc cgcttacaga caagctgtga ccgtctccgg 2640 gagctgcatgtgtcagaggt tttcaccgtc atcaccgaaa cgcgcgagac gaaagggcct 2700 cgtgatacgcctatttttat aggttaatgt catgataata atggtttctt agacgtcagg 2760 tggcacttttcggggaaatg tgcgcggaac ccctatttgt ttatttttct aaaaagcttc 2820 acgctgccgcaagcactcag ggcgcaaggg ctgctaaagg aagcggaaca cgtagaaagc 2880 cagtccgcagaaacggtgct gaccccggat gaatgtcagc tactgggcta tctggacaag 2940 ggaaaacgcaagcgcaaaga gaaagcaggt agcttgcagt gggcttacat ggcgatagct 3000 agactgggcggttttatgga cagcaagcga accggaattg ccagctgggg cgccctctgg 3060 taaggttgggaagccctgca aagtaaactg gatggctttc ttgccgccaa ggatctgatg 3120 gcgcaggggatcaagatctg atcaagagac aggatgagga tcgtttcgca tgattgaaca 3180 agatggattgcacgcaggtt ctccggccgc ttgggtggag aggctattcg gctatgactg 3240 ggcacaacagacaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg 3300 cccggttctttttgtcaaga ccgacctgtc cggtgccctg aatgaactgc aggacgaggc 3360 agcgcggctatcgtggctgg ccacgacggg 
cgttccttgc gcagctgtgc tcgacgttgt 3420 cactgaagcgggaagggact ggctgctatt gggcgaagtg ccggggcagg atctcctgtc 3480 atctcaccttgctcctgccg agaaagtatc catcatggct gatgcaatgc ggcggctgca 3540 tacgcttgatccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc 3600 acgtactcggatggaagccg gtcttgtcga tcaggatgat ctggacgaag agcatcaggg 3660 gctcgcgccagccgaactgt tcgccaggct caaggcgcgc atgcccgacg gcgaggatct 3720 cgtcgtgacccatggcgatg cctgcttgcc gaatatcatg gtggaaaatg gccgcttttc 3780 tggattcatcgactgtggcc ggctgggtgt ggcggaccgc tatcaggaca tagcgttggc 3840 tacccgtgatattgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta 3900 cggtatcgccgctcccgatt cgcagcgcat cgccttctat cgccttcttg acgagttctt 3960 ctgagcgggactctggggtt cgaaatgacc gaccaagcga cgcccaacct gccatcacga 4020 gatttcgattccaccgccgc cttctatgaa aggttgggct tcggaatcgt tttccgggac 4080 gccggctggatgatcctcca gcgcggggat ctcatgctgg agttcttcgc ccaccccggg 4140 catgaccaaaatcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 4200 gatcaaaggatcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa 4260 aaaaccaccgctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc 4320 gaaggtaactggcttcagca gagcgcagat accaaatact gtccttctag tgtagccgta 4380 gttaggccaccacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct 4440 gttaccagtggctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 4500 atagttaccggataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag 4560 cttggagcgaacgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc 4620 cacgcttcccgaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg 4680 agagcgcacgagggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt 4740 tcgccacctctgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 4800 2 50 DNAArtificial Sequence Synthetic Oligonucleotide 2 aagcttacct cgatttgaggacgttacaag tattactgtt aaggagcgta 50 3 50 DNA Artificial SequenceSynthetic Oligonucleotide 3 gattaaaaaa tgaaattgaa aatgaattat tagaattggcttaaataaac 50 4 50 DNA Artificial Sequence Synthetic Oligonucleotide 4agaatcacca aaaaggaata gagtatgaag tttggaaata tttgtttttc 50 5 50 
DNAArtificial Sequence Synthetic Oligonucleotide 5 gtatcaacca ccaggtgaaactcataagct aagtaatgga tcgctttgtt 50 6 50 DNA Artificial SequenceSynthetic Oligonucleotide 6 cggcttggta tcgcctcaga agagtagggt ttgatacatattggacctta 50 7 50 DNA Artificial Sequence Synthetic Oligonucleotide 7gaacatcatt ttacagagtt tggtcttacg ggaaatttat ttgttgctgc 50 8 50 DNAArtificial Sequence Synthetic Oligonucleotide 8 ggctaacctg ttaggaagaactaaaacatt aaatgttggc actatggggg 50 9 50 DNA Artificial SequenceSynthetic Oligonucleotide 9 ttgttattcc gacagcacac ccagttcgac agttagaagacgttttatta 50 10 50 DNA Artificial Sequence Synthetic Oligonucleotide 10ttagatcaaa tgtcgaaagg tcgttttaat tttggaaccg ttcgagggct 50 11 50 DNAArtificial Sequence Synthetic Oligonucleotide 11 ataccataaa gattttcgagtatttggtgt tgatatggaa gagtctcgag 50 12 50 DNA Artificial SequenceSynthetic Oligonucleotide 12 caattactca aaatttctac cagatgataa tggaaagcttacagacagga 50 13 50 DNA Artificial Sequence Synthetic Oligonucleotide 13accattagct ctgatagtga ttacattcaa tttcctaagg ttgatgtata 50 14 50 DNAArtificial Sequence Synthetic Oligonucleotide 14 tcccaaagtg tactcaaaaaatgtaccaac ctgtatgact gctgagtccg 50 15 50 DNA Artificial SequenceSynthetic Oligonucleotide 15 caagtacgac agaatggcta gcaatacaag ggctaccaatggttcttagt 50 16 50 DNA Artificial Sequence Synthetic Oligonucleotide 16tggattattg gtactaatga aaaaaaagca cagatggaac tctataatga 50 17 50 DNAArtificial Sequence Synthetic Oligonucleotide 17 aattgcgaca gaatatggtcatgatatatc taaaatagat cattgtatga 50 18 50 DNA Artificial SequenceSynthetic Oligonucleotide 18 cttatatttg ttctgttgat gatgatgcac aaaaggcgcaagatgtttgt 50 19 50 DNA Artificial Sequence Synthetic Oligonucleotide 19cgggagtttc tgaaaaattg gtatgactca tatgtaaatg cgaccaatat 50 20 50 DNAArtificial Sequence Synthetic Oligonucleotide 20 ctttaatgat agcaatcaaactcgtggtta tgattatcat aaaggtcaat 50 21 50 DNA Artificial SequenceSynthetic Oligonucleotide 21 ggcgtgattt tgttttacaa ggacatacaa acaccaatcgacgtgttgat 50 22 50 DNA Artificial Sequence Synthetic 
Oligonucleotide 22tatagcaatg gtattaaccc tgtaggcact cctgagcagt gtattgaaat 50 23 50 DNAArtificial Sequence Synthetic Oligonucleotide 23 cattcaacgt gatattgatgcaacgggtat tacaaacatt acatgcggat 50 24 50 DNA Artificial SequenceSynthetic Oligonucleotide 24 ttgaagctaa tggaactgaa gatgaaataa ttgcttccatgcgacgcttt 50 25 50 DNA Artificial Sequence Synthetic Oligonucleotide 25atgacacaag tcgctccttt cttaaaagaa cctaaataaa ttacttattt 50 26 50 DNAArtificial Sequence Synthetic Oligonucleotide 26 gatactagag ataataaggaacaagttatg aaatttggat tattttttct 50 27 50 DNA Artificial SequenceSynthetic Oligonucleotide 27 aaactttcag aaagatggaa taacatctga agaaacgttggataatatgg 50 28 50 DNA Artificial Sequence Synthetic Oligonucleotide 28taaagactgt cacgttaatt gattcaacta aatatcattt taatactgcc 50 29 50 DNAArtificial Sequence Synthetic Oligonucleotide 29 tttgttaatg aacatcacttttcaaaaaat ggtattgttg gagcacctat 50 30 50 DNA Artificial SequenceSynthetic Oligonucleotide 30 taccgcagct ggttttttat tagggttaac aaataaattacatattggtt 50 31 50 DNA Artificial Sequence Synthetic Oligonucleotide 31cattaaatca agtaattacc acccatcacc ctgtacgtgt agcagaagaa 50 32 50 DNAArtificial Sequence Synthetic Oligonucleotide 32 gccagtttat tagatcaaatgtcagaggga cgcttcattc ttggttttag 50 33 50 DNA Artificial SequenceSynthetic Oligonucleotide 33 tgactgcgaa agtgatttcg aaatggaatt ttttagacgtcatatctcat 50 34 50 DNA Artificial Sequence Synthetic Oligonucleotide 34caaggcaaca acaatttgaa gcatgctatg aaataattaa tgacgcatta 50 35 50 DNAArtificial Sequence Synthetic Oligonucleotide 35 actacaggtt attgtcatccccaaaacgac ttttatgatt ttccaaaggt 50 36 50 DNA Artificial SequenceSynthetic Oligonucleotide 36 ttcaattaat ccacactgtt acagtgagaa tggacctaagcaatatgtat 50 37 50 DNA Artificial Sequence Synthetic Oligonucleotide 37ccgctacatc aaaagaagtc gtcatgtggg cagcgaaaaa ggcactgcct 50 38 50 DNAArtificial Sequence Synthetic Oligonucleotide 38 ttaacattta agtgggaggataatttagaa accaaagaac gctatgcaat 50 39 50 DNA Artificial SequenceSynthetic Oligonucleotide 39 tctatataat 
aaaacagcac aacaatatgg tattgatatttcggatgttg 50 40 50 DNA Artificial Sequence Synthetic Oligonucleotide 40atcatcaatt aactgtaatt gcgaacttaa atgctgatag aagtacggct 50 41 50 DNAArtificial Sequence Synthetic Oligonucleotide 41 caagaagaag tgagagaatacttaaaagac tatatcactg aaacttaccc 50 42 50 DNA Artificial SequenceSynthetic Oligonucleotide 42 tcaaatggac agagatgaaa aaattaactg cattattgaagagaatgcag 50 43 50 DNA Artificial Sequence Synthetic Oligonucleotide 43ttgggtctca tgatgactat tatgaatcga caaaattagc agtggaaaaa 50 44 50 DNAArtificial Sequence Synthetic Oligonucleotide 44 acagggtcta aaaatattttattatccttt gaatcaatgt ccgatattaa 50 45 50 DNA Artificial SequenceSynthetic Oligonucleotide 45 agatgtaaaa gatattattg atatgttgaa ccaaaaaatcgaaatgaatt 50 46 50 DNA Artificial Sequence Synthetic Oligonucleotide 46taccataata aaattaaagg caatttctat attagattgc ctttttgggg 50 47 50 DNAArtificial Sequence Synthetic Oligonucleotide 47 atcctctaga aatattttatctgattaata agatgagaat tcactggccg 50 48 50 DNA Artificial SequenceSynthetic Oligonucleotide 48 tcgttttaca acgtcgtgac tgggaaaacc ctggcgttacccaacttaat 50 49 50 DNA Artificial Sequence Synthetic Oligonucleotide 49cgccttgcag cacatccccc tttcgccagc tggcgtaata gcgaagaggc 50 50 50 DNAArtificial Sequence Synthetic Oligonucleotide 50 ccgcaccgat cgcccttcccaacagttgcg cagcctgaat ggcgaatggc 50 51 50 DNA Artificial SequenceSynthetic Oligonucleotide 51 gcctgatgcg gtattttctc cttacgcatc tgtgcggtatttcacaccgc 50 52 50 DNA Artificial Sequence Synthetic Oligonucleotide 52atatggtgca ctctcagtac aatctgctct gatgccgcat agttaagcca 50 53 50 DNAArtificial Sequence Synthetic Oligonucleotide 53 gccccgacac ccgccaacacccgctgacgc gccctgacgg gcttgtctgc 50 54 50 DNA Artificial SequenceSynthetic Oligonucleotide 54 tcccggcatc cgcttacaga caagctgtga ccgtctccgggagctgcatg 50 55 50 DNA Artificial Sequence Synthetic Oligonucleotide 55tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgagac gaaagggcct 50 56 50 DNAArtificial Sequence Synthetic Oligonucleotide 56 cgtgatacgc ctatttttataggttaatgt catgataata 
atggtttctt 50 57 50 DNA Artificial SequenceSynthetic Oligonucleotide 57 agacgtcagg tggcactttt cggggaaatg tgcgcggaacccctatttgt 50 58 50 DNA Artificial Sequence Synthetic Oligonucleotide 58ttatttttct aaaaagcttc acgctgccgc aagcactcag ggcgcaaggg 50 59 50 DNAArtificial Sequence Synthetic Oligonucleotide 59 ctgctaaagg aagcggaacacgtagaaagc cagtccgcag aaacggtgct 50 60 50 DNA Artificial SequenceSynthetic Oligonucleotide 60 gaccccggat gaatgtcagc tactgggcta tctggacaagggaaaacgca 50 61 50 DNA Artificial Sequence Synthetic Oligonucleotide 61agcgcaaaga gaaagcaggt agcttgcagt gggcttacat ggcgatagct 50 62 50 DNAArtificial Sequence Synthetic Oligonucleotide 62 agactgggcg gttttatggacagcaagcga accggaattg ccagctgggg 50 63 50 DNA Artificial SequenceSynthetic Oligonucleotide 63 cgccctctgg taaggttggg aagccctgca aagtaaactggatggctttc 50 64 50 DNA Artificial Sequence Synthetic Oligonucleotide 64ttgccgccaa ggatctgatg gcgcagggga tcaagatctg atcaagagac 50 65 50 DNAArtificial Sequence Synthetic Oligonucleotide 65 aggatgagga tcgtttcgcatgattgaaca agatggattg cacgcaggtt 50 66 50 DNA Artificial SequenceSynthetic Oligonucleotide 66 ctccggccgc ttgggtggag aggctattcg gctatgactgggcacaacag 50 67 50 DNA Artificial Sequence Synthetic Oligonucleotide 67acaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg 50 68 50 DNAArtificial Sequence Synthetic Oligonucleotide 68 cccggttctt tttgtcaagaccgacctgtc cggtgccctg aatgaactgc 50 69 50 DNA Artificial SequenceSynthetic Oligonucleotide 69 aggacgaggc agcgcggcta tcgtggctgg ccacgacgggcgttccttgc 50 70 50 DNA Artificial Sequence Synthetic Oligonucleotide 70gggcgaagtg ccggggcagg atctcctgtc atctcacctt gctcctgccg 50 71 50 DNAArtificial Sequence Synthetic Oligonucleotide 71 gggcgaagtg ccggggcaggatctcctgtc atctcacctt gctcctgccg 50 72 50 DNA Artificial SequenceSynthetic Oligonucleotide 72 agaaagtatc catcatggct gatgcaatgc ggcggctgcatacgcttgat 50 73 50 DNA Artificial Sequence Synthetic Oligonucleotide 73ccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc 50 74 50 DNAArtificial 
Sequence Synthetic Oligonucleotide 74 acgtactcgg atggaagccggtcttgtcga tcaggatgat ctggacgaag 50 75 50 DNA Artificial SequenceSynthetic Oligonucleotide 75 agcatcaggg gctcgcgcca gccgaactgt tcgccaggctcaaggcgcgc 50 76 50 DNA Artificial Sequence Synthetic Oligonucleotide 76atgcccgacg gcgaggatct cgtcgtgacc catggcgatg cctgcttgcc 50 77 50 DNAArtificial Sequence Synthetic Oligonucleotide 77 gaatatcatg gtggaaaatggccgcttttc tggattcatc gactgtggcc 50 78 50 DNA Artificial SequenceSynthetic Oligonucleotide 78 ggctgggtgt ggcggaccgc tatcaggaca tagcgttggctacccgtgat 50 79 50 DNA Artificial Sequence Synthetic Oligonucleotide 79attgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta 50 80 50 DNAArtificial Sequence Synthetic Oligonucleotide 80 cggtatcgcc gctcccgattcgcagcgcat cgccttctat cgccttcttg 50 81 50 DNA Artificial SequenceSynthetic Oligonucleotide 81 acgagttctt ctgagcggga ctctggggtt cgaaatgaccgaccaagcga 50 82 50 DNA Artificial Sequence Synthetic Oligonucleotide 82cgcccaacct gccatcacga gatttcgatt ccaccgccgc cttctatgaa 50 83 50 DNAArtificial Sequence Synthetic Oligonucleotide 83 aggttgggct tcggaatcgttttccgggac gccggctgga tgatcctcca 50 84 50 DNA Artificial SequenceSynthetic Oligonucleotide 84 gcgcggggat ctcatgctgg agttcttcgc ccaccccgggcatgaccaaa 50 85 50 DNA Artificial Sequence Synthetic Oligonucleotide 85atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 50 86 50 DNAArtificial Sequence Synthetic Oligonucleotide 86 gatcaaagga tcttcttgagatcctttttt tctgcgcgta atctgctgct 50 87 50 DNA Artificial SequenceSynthetic Oligonucleotide 87 tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgtttgccggatcaa 50 88 50 DNA Artificial Sequence Synthetic Oligonucleotide 88gagctaccaa ctctttttcc gaaggtaact ggcttcagca gagcgcagat 50 89 51 DNAArtificial Sequence Synthetic Oligonucleotide 89 accaaatact gtccttctagtgtagccgta gttaggccac cacttcaatg a 51 90 50 DNA Artificial SequenceSynthetic Oligonucleotide 90 actctgtagc accgcctaca tacctcgctc tgctaatcctgttaccagtg 50 91 50 DNA Artificial Sequence Synthetic Oligonucleotide 
91gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 50 92 50 DNAArtificial Sequence Synthetic Oligonucleotide 92 atagttaccg gataaggcgcagcggtcggg ctgaacgggg ggttcgtgca 50 93 50 DNA Artificial SequenceSynthetic Oligonucleotide 93 cacagcccag cttggagcga acgacctaca ccgaactgagatacctacag 50 94 50 DNA Artificial Sequence Synthetic Oligonucleotide 94cgtgagctat gagaaagcgc cacgcttccc gaagggagaa aggcggacag 50 95 50 DNAArtificial Sequence Synthetic Oligonucleotide 95 gtatccggta agcggcagggtcggaacagg agagcgcacg agggagcttc 50 96 50 DNA Artificial SequenceSynthetic Oligonucleotide 96 cagggggaaa cgcctggtat ctttatagtc ctgtcgggtttcgccacctc 50 97 50 DNA Artificial Sequence Synthetic Oligonucleotide 97tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 50 98 50 DNAArtificial Sequence Synthetic Oligonucleotide 98 catcacaaaa atcgacgctcaagtcagagg tggcgaaacc cgacaggact 50 99 50 DNA Artificial SequenceSynthetic Oligonucleotide 99 ataaagatac caggcgtttc cccctggaag ctccctcgtgcgctctcctg 50 100 50 DNA Artificial Sequence Synthetic Oligonucleotide100 ttccgaccct gccgcttacc ggatacctgt ccgcctttct cccttcggga 50 101 50 DNAArtificial Sequence Synthetic Oligonucleotide 101 agcgtggcgc tttctcatagctcacgctgt aggtatctca gttcggtgta 50 102 50 DNA Artificial SequenceSynthetic Oligonucleotide 102 ggtcgttcgc tccaagctgg gctgtgtgcacgaacccccc gttcagcccg 50 103 50 DNA Artificial Sequence SyntheticOligonucleotide 103 accgctgcgc cttatccggt aactatcgtc ttgagtccaacccggtaaga 50 104 50 DNA Artificial Sequence Synthetic Oligonucleotide104 cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc 50 105 50 DNAArtificial Sequence Synthetic Oligonucleotide 105 gaggtatgta ggcggtgctacagagttctt gaagtggtgg cctaactacg 50 106 50 DNA Artificial SequenceSynthetic Oligonucleotide 106 gctacactag aaggacagta tttggtatctgcgctctgct gaagccagtt 50 107 50 DNA Artificial Sequence SyntheticOligonucleotide 107 accttcggaa aaagagttgg tagctcttga tccggcaaacaaaccaccgc 50 108 50 DNA Artificial Sequence Synthetic Oligonucleotide108 tggtagcggt 
ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 50 109 50 DNAArtificial Sequence Synthetic Oligonucleotide 109 aaggatctca agaagatcctttgatctttt ctacggggtc tgacgctcag 50 110 50 DNA Artificial SequenceSynthetic Oligonucleotide 110 tggaacgaaa actcacgtta agggattttggtcatgcccg gggtgggcga 50 111 50 DNA Artificial Sequence SyntheticOligonucleotide 111 agaactccag catgagatcc ccgcgctgga ggatcatccagccggcgtcc 50 112 50 DNA Artificial Sequence Synthetic Oligonucleotide112 cggaaaacga ttccgaagcc caacctttca tagaaggcgg cggtggaatc 50 113 50 DNAArtificial Sequence Synthetic Oligonucleotide 113 gaaatctcgt gatggcaggttgggcgtcgc ttggtcggtc atttcgaacc 50 114 50 DNA Artificial SequenceSynthetic Oligonucleotide 114 ccagagtccc gctcagaaga actcgtcaagaaggcgatag aaggcgatgc 50 115 50 DNA Artificial Sequence SyntheticOligonucleotide 115 gctgcgaatc gggagcggcg ataccgtaaa gcacgaggaagcggtcagcc 50 116 50 DNA Artificial Sequence Synthetic Oligonucleotide116 cattcgccgc caagctcttc agcaatatca cgggtagcca acgctatgtc 50 117 50 DNAArtificial Sequence Synthetic Oligonucleotide 117 ctgatagcgg tccgccacacccagccggcc acagtcgatg aatccagaaa 50 118 50 DNA Artificial SequenceSynthetic Oligonucleotide 118 agcggccatt ttccaccatg atattcggcaagcaggcatc gccatgggtc 50 119 50 DNA Artificial Sequence SyntheticOligonucleotide 119 acgacgagat cctcgccgtc gggcatgcgc gccttgagcctggcgaacag 50 120 50 DNA Artificial Sequence Synthetic Oligonucleotide120 ttcggctggc gcgagcccct gatgctcttc gtccagatca tcctgatcga 50 121 50 DNAArtificial Sequence Synthetic Oligonucleotide 121 caagaccggc ttccatccgagtacgtgctc gctcgatgcg atgtttcgct 50 122 50 DNA Artificial SequenceSynthetic Oligonucleotide 122 tggtggtcga atgggcaggt agccggatcaagcgtatgca gccgccgcat 50 123 50 DNA Artificial Sequence SyntheticOligonucleotide 123 tgcatcagcc atgatggata ctttctcggc aggagcaaggtgagatgaca 50 124 50 DNA Artificial Sequence Synthetic Oligonucleotide124 ggagatcctg ccccggcact tcgcccaata gcagccagtc ccttcccgct 50 125 50 DNAArtificial Sequence Synthetic Oligonucleotide 125 tcagtgacaa 
cgtcgagcacagctgcgcaa ggaacgcccg tcgtggccag 50 126 50 DNA Artificial SequenceSynthetic Oligonucleotide 126 ccacgatagc cgcgctgcct cgtcctgcagttcattcagg gcaccggaca 50 127 50 DNA Artificial Sequence SyntheticOligonucleotide 127 ggtcggtctt gacaaaaaga accgggcgcc cctgcgctgacagccggaac 50 128 50 DNA Artificial Sequence Synthetic Oligonucleotide128 acggcggcat cagagcagcc gattgtctgt tgtgcccagt catagccgaa 50 129 50 DNAArtificial Sequence Synthetic Oligonucleotide 129 tagcctctcc acccaagcggccggagaacc tgcgtgcaat ccatcttgtt 50 130 50 DNA Artificial SequenceSynthetic Oligonucleotide 130 caatcatgcg aaacgatcct catcctgtctcttgatcaga tcttgatccc 50 131 50 DNA Artificial Sequence SyntheticOligonucleotide 131 ctgcgccatc agatccttgg cggcaagaaa gccatccagtttactttgca 50 132 50 DNA Artificial Sequence Synthetic Oligonucleotide132 gggcttccca accttaccag agggcgcccc agctggcaat tccggttcgc 50 133 50 DNAArtificial Sequence Synthetic Oligonucleotide 133 ttgctgtcca taaaaccgcccagtctagct atcgccatgt aagcccactg 50 134 50 DNA Artificial SequenceSynthetic Oligonucleotide 134 caagctacct gctttctctt tgcgcttgcgttttcccttg tccagatagc 50 135 50 DNA Artificial Sequence SyntheticOligonucleotide 135 ccagtagctg acattcatcc ggggtcagca ccgtttctgcggactggctt 50 136 50 DNA Artificial Sequence Synthetic Oligonucleotide136 tctacgtgtt ccgcttcctt tagcagccct tgcgccctga gtgcttgcgg 50 137 50 DNAArtificial Sequence Synthetic Oligonucleotide 137 cagcgtgaag ctttttagaaaaataaacaa ataggggttc cgcgcacatt 50 138 50 DNA Artificial SequenceSynthetic Oligonucleotide 138 tccccgaaaa gtgccacctg acgtctaagaaaccattatt atcatgacat 50 139 50 DNA Artificial Sequence SyntheticOligonucleotide 139 taacctataa aaataggcgt atcacgaggc cctttcgtctcgcgcgtttc 50 140 50 DNA Artificial Sequence Synthetic Oligonucleotide140 ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 50 141 50 DNAArtificial Sequence Synthetic Oligonucleotide 141 agcttgtctg taagcggatgccgggagcag acaagcccgt cagggcgcgt 50 142 50 DNA Artificial SequenceSynthetic Oligonucleotide 142 cagcgggtgt 
tggcgggtgt cggggctggcttaactatgc ggcatcagag 50 143 50 DNA Artificial Sequence SyntheticOligonucleotide 143 cagattgtac tgagagtgca ccatatgcgg tgtgaaataccgcacagatg 50 144 50 DNA Artificial Sequence Synthetic Oligonucleotide144 cgtaaggaga aaataccgca tcaggcgcca ttcgccattc aggctgcgca 50 145 50 DNAArtificial Sequence Synthetic Oligonucleotide 145 actgttggga agggcgatcggtgcgggcct cttcgctatt acgccagctg 50 146 50 DNA Artificial SequenceSynthetic Oligonucleotide 146 gcgaaagggg gatgtgctgc aaggcgattaagttgggtaa cgccagggtt 50 147 50 DNA Artificial Sequence SyntheticOligonucleotide 147 ttcccagtca cgacgttgta aaacgacggc cagtgaattctcatcttatt 50 148 50 DNA Artificial Sequence Synthetic Oligonucleotide148 aatcagataa aatatttcta gaggatcccc aaaaaggcaa tctaatatag 50 149 50 DNAArtificial Sequence Synthetic Oligonucleotide 149 aaattgcctt taattttattatggtaaatt catttcgatt ttttggttca 50 150 50 DNA Artificial SequenceSynthetic Oligonucleotide 150 acatatcaat aatatctttt acatctttaatatcggacat tgattcaaag 50 151 50 DNA Artificial Sequence SyntheticOligonucleotide 151 gataataaaa tatttttaga ccctgttttt tccactgctaattttgtcga 50 152 50 DNA Artificial Sequence Synthetic Oligonucleotide152 ttcataatag tcatcatgag acccaactgc attctcttca ataatgcagt 50 153 50 DNAArtificial Sequence Synthetic Oligonucleotide 153 taattttttc atctctgtccatttgagggt aagtttcagt gatatagtct 50 154 50 DNA Artificial SequenceSynthetic Oligonucleotide 154 tttaagtatt ctctcacttc ttcttgagccgtacttctat cagcatttaa 50 155 50 DNA Artificial Sequence SyntheticOligonucleotide 155 gttcgcaatt acagttaatt gatgatcaac atccgaaatatcaataccat 50 156 50 DNA Artificial Sequence Synthetic Oligonucleotide156 attgttgtgc tgttttatta tatagaattg catagcgttc tttggtttct 50 157 50 DNAArtificial Sequence Synthetic Oligonucleotide 157 aaattatcct cccacttaaatgttaaaggc agtgcctttt tcgctgccca 50 158 50 DNA Artificial SequenceSynthetic Oligonucleotide 158 catgacgact tcttttgatg tagcggatacatattgctta ggtccattct 50 159 50 DNA Artificial Sequence SyntheticOligonucleotide 159 cactgtaaca 
gtgtggatta attgaaacct ttggaaaatcataaaagtcg 50 160 50 DNA Artificial Sequence Synthetic Oligonucleotide160 ttttggggat gacaataacc tgtagttaat gcgtcattaa ttatttcata 50 161 50 DNAArtificial Sequence Synthetic Oligonucleotide 161 gcatgcttca aattgttgttgccttgatga gatatgacgt ctaaaaaatt 50 162 50 DNA Artificial SequenceSynthetic Oligonucleotide 162 ccatttcgaa atcactttcg cagtcactaaaaccaagaat gaagcgtccc 50 163 50 DNA Artificial Sequence SyntheticOligonucleotide 163 tctgacattt gatctaataa actggcttct tctgctacacgtacagggtg 50 164 50 DNA Artificial Sequence Synthetic Oligonucleotide164 atgggtggta attacttgat ttaatgaacc aatatgtaat ttatttgtta 50 165 50 DNAArtificial Sequence Synthetic Oligonucleotide 165 accctaataa aaaaccagctgcggtaatag gtgctccaac aataccattt 50 166 50 DNA Artificial SequenceSynthetic Oligonucleotide 166 tttgaaaagt gatgttcatt aacaaaggcagtattaaaat gatatttagt 50 167 50 DNA Artificial Sequence SyntheticOligonucleotide 167 tgaatcaatt aacgtgacag tctttaccat attatccaacgtttcttcag 50 168 50 DNA Artificial Sequence Synthetic Oligonucleotide168 atgttattcc atctttctga aagtttagaa aaaataatcc aaatttcata 50 169 50 DNAArtificial Sequence Synthetic Oligonucleotide 169 acttgttcct tattatctctagtatcaaat aagtaattta tttaggttct 50 170 50 DNA Artificial SequenceSynthetic Oligonucleotide 170 tttaagaaag gagcgacttg tgtcataaagcgtcgcatgg aagcaattat 50 171 50 DNA Artificial Sequence SyntheticOligonucleotide 171 ttcatcttca gttccattag cttcaaatcc gcatgtaatgtttgtaatac 50 172 50 DNA Artificial Sequence Synthetic Oligonucleotide172 ccgttgcatc aatatcacgt tgaatgattt caatacactg ctcaggagtg 50 173 50 DNAArtificial Sequence Synthetic Oligonucleotide 173 cctacagggt taataccattgctataatca acacgtcgat tggtgtttgt 50 174 50 DNA Artificial SequenceSynthetic Oligonucleotide 174 atgtccttgt aaaacaaaat cacgccattgacctttatga taatcataac 50 175 50 DNA Artificial Sequence SyntheticOligonucleotide 175 cacgagtttg attgctatca ttaaagatat tggtcgcatttacatatgag 50 176 50 DNA Artificial Sequence Synthetic Oligonucleotide176 tcataccaat 
ttttcagaaa ctcccgacaa acatcttgcg ccttttgtgc 50 177 50 DNAArtificial Sequence Synthetic Oligonucleotide 177 atcatcatca acagaacaaatataagtcat acaatgatct attttagata 50 178 50 DNA Artificial SequenceSynthetic Oligonucleotide 178 tatcatgacc atattctgtc gcaatttcattatagagttc catctgtgct 50 179 50 DNA Artificial Sequence SyntheticOligonucleotide 179 tttttttcat tagtaccaat aatccaacta agaaccattggtagcccttg 50 180 50 DNA Artificial Sequence Synthetic Oligonucleotide180 tattgctagc cattctgtcg tacttgcgga ctcagcagtc atacaggttg 50 181 50 DNAArtificial Sequence Synthetic Oligonucleotide 181 gtacattttt tgagtacactttgggatata catcaacctt aggaaattga 50 182 50 DNA Artificial SequenceSynthetic Oligonucleotide 182 atgtaatcac tatcagagct aatggttcctgtctgtaagc tttccattat 50 183 50 DNA Artificial Sequence SyntheticOligonucleotide 183 catctggtag aaattttgag taattgctcg agactcttccatatcaacac 50 184 50 DNA Artificial Sequence Synthetic Oligonucleotide184 caaatactcg aaaatcttta tggtatagcc ctcgaacggt tccaaaatta 50 185 50 DNAArtificial Sequence Synthetic Oligonucleotide 185 aaacgacctt tcgacatttgatctaataat aaaacgtctt ctaactgtcg 50 186 50 DNA Artificial SequenceSynthetic Oligonucleotide 186 aactgggtgt gctgtcggaa taacaacccccatagtgcca acatttaatg 50 187 50 DNA Artificial Sequence SyntheticOligonucleotide 187 ttttagttct tcctaacagg ttagccgcag caacaaataaatttcccgta 50 188 50 DNA Artificial Sequence Synthetic Oligonucleotide188 agaccaaact ctgtaaaatg atgttctaag gtccaatatg tatcaaaccc 50 189 50 DNAArtificial Sequence Synthetic Oligonucleotide 189 tactcttctg aggcgataccaagccgaaca aagcgatcca ttacttagct 50 190 50 DNA Artificial SequenceSynthetic Oligonucleotide 190 tatgagtttc acctggtggt tgatacgaaaaacaaatatt tccaaacttc 50 191 50 DNA Artificial Sequence SyntheticOligonucleotide 191 atactctatt cctttttggt gattctgttt atttaagccaattctaataa 50 192 50 DNA Artificial Sequence Synthetic Oligonucleotide192 ttcattttca atttcatttt ttaatctacg ctccttaaca gtaatacttg 50 193 50 DNAArtificial Sequence Synthetic Oligonucleotide 193 taacgtcctc 
aaatcgaggtaagcttcata ggctccgccc ccctgacgag 50
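
In the listing above, oligonucleotides 2 through 97 appear to be consecutive 50-base segments of the 4800-base plus strand of the synthetic plasmid, and oligonucleotides 98 through 193 appear to be minus-strand segments shifted by about half an oligonucleotide, with the last one wrapping around the circular junction. The following is a minimal sketch of how such a tiling could be computed. It is an illustration under stated assumptions, not the procedure actually used to prepare the listing: the function names `revcomp` and `tile_circular`, the parameters `oligo_len` and `offset`, and the default offset of 25 bases are all hypothetical, and the order in which the minus-strand oligonucleotides are returned is arbitrary.

```python
# Illustrative sketch only. The names revcomp, tile_circular, oligo_len and
# offset are hypothetical and do not come from the disclosure.

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_circular(plus, oligo_len=50, offset=25):
    """Split a circular plus strand into plus- and minus-strand oligos.

    Plus-strand oligos are consecutive windows of oligo_len bases.
    Minus-strand oligos are reverse complements of windows of the same
    length shifted by `offset`, so each one bridges two neighbouring
    plus-strand oligos. The last window wraps past the end of the circle.
    """
    n = len(plus)
    if n == 0 or n % oligo_len != 0:
        raise ValueError("sequence length must be a non-zero multiple of oligo_len")
    if not 0 < offset < oligo_len:
        raise ValueError("offset must fall strictly inside one oligo")
    doubled = plus + plus  # lets the shifted windows wrap around the circle
    plus_oligos = [plus[i:i + oligo_len] for i in range(0, n, oligo_len)]
    minus_oligos = [
        revcomp(doubled[i + offset:i + offset + oligo_len])
        for i in range(0, n, oligo_len)
    ]
    return plus_oligos, minus_oligos


# Toy 12-base circle tiled with 4-base oligos shifted by 2 bases.
plus_set, minus_set = tile_circular("aacctgcagtta", oligo_len=4, offset=2)
print(plus_set)   # ['aacc', 'tgca', 'gtta']
print(minus_set)  # ['cagg', 'actg', 'ttta']
```

In the toy example each minus-strand oligonucleotide pairs with the last two bases of one plus-strand oligonucleotide and the first two bases of the next, and the third minus-strand oligonucleotide spans the junction between the end and the start of the circle. For a linear molecule the wrap-around window would instead be truncated, so that the terminal oligonucleotides hybridize with only one partner.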

1. A method for the synthesis of a replication-competent, double-stranded polynucleotide, wherein said polynucleotide comprises an origin of replication, a first coding region and a first regulatory element directing the expression of said first coding region, comprising the steps of: (a) generating a first set of oligonucleotides corresponding to the entire plus strand of said double-stranded polynucleotide; (b) generating a second set of oligonucleotides corresponding to the entire minus strand of said double-stranded polynucleotide; and (c) annealing said first and said second set of oligonucleotides; wherein each of said oligonucleotides of said second set of oligonucleotides overlaps with and hybridizes to two complementary oligonucleotides of said first set of oligonucleotides, except that two oligonucleotides at a 5′ or 3′ end of said double-stranded polynucleotide will hybridize with only one complementary oligonucleotide.
 2. The method of claim 1, further comprising the step of treating said annealed oligonucleotides with a ligating enzyme to generate continuous strands of said double-stranded polynucleotide.
 3. The method of claim 1, further comprising the step of amplifying said double-stranded polynucleotide.
 4. The method of claim 1, wherein said double-stranded polynucleotide comprises 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 5000, 10×10³, 20×10³, 30×10³, 40×10³, 50×10³, 60×10³, 70×10³, 80×10³, 90×10³, 1×10⁴, 1×10⁵, 1×10⁶, 1×10⁷, 1×10⁸, 1×10⁹ or 1×10¹⁰ base pairs in length.
 5. The method of claim 1, wherein said first regulatory element is a promoter.
 6. The method of claim 5, wherein said double-stranded polynucleotide comprises a second regulatory element, said second regulatory element being a polyadenylation signal.
 7. The method of claim 1, wherein said double-stranded polynucleotide comprises a plurality of coding regions and a plurality of regulatory elements.
 8. The method of claim 7, wherein said coding regions encode products that comprise a biochemical pathway.
 9. The method of claim 8, wherein said biochemical pathway is glycolysis.
 10. The method of claim 9, wherein said coding regions encode enzymes selected from the group consisting of hexokinase, phosphohexose isomerase, phosphofructokinase-1, aldolase, triose-phosphate isomerase, glyceraldehyde-3-phosphate dehydrogenase, phosphoglycerate kinase, phosphoglycerate mutase, enolase and pyruvate kinase.
 11. The method of claim 8, wherein said biochemical pathway is lipid synthesis.
 12. The method of claim 7, wherein said biochemical pathway is cofactor synthesis.
 13. The method of claim 12, wherein said pathway involves lipoic acid.
14. The method of claim 13, wherein said biochemical pathway is riboflavin synthesis.
15. The method of claim 7, wherein said biochemical pathway is nucleotide synthesis.
16. The method of claim 15, wherein said nucleotide is a purine.
17. The method of claim 15, wherein said nucleotide is a pyrimidine.
18. The method of claim 7, wherein said coding regions encode enzymes involved in a cellular process selected from the group consisting of cell division, chaperone, detoxification, peptide secretion, energy metabolism, regulatory function, DNA replication, transcription, RNA processing and tRNA modification.
19. The method of claim 18, wherein said energy metabolism is oxidative phosphorylation.
20. The method of claim 1, wherein said double-stranded polynucleotide is a DNA.
21. The method of claim 1, wherein said double-stranded polynucleotide is an RNA.
22. The method of claim 1, wherein said double-stranded polynucleotide is an expression construct.
23. The method of claim 22, wherein said expression construct is a bacterial expression construct.
24. The method of claim 22, wherein said expression construct is a mammalian expression construct.
25. The method of claim 22, wherein said expression construct is a viral expression construct.
26. The method of claim 1, wherein said double-stranded polynucleotide comprises a genome selected from the group consisting of bacterial genome, yeast genome, viral genome, mammalian genome, amphibian genome and avian genome.
27. The method of claim 1, wherein said overlap between the oligonucleotides of said first and said second set of oligonucleotides is between about 5 base pairs and about 75 base pairs.
28. The method of claim 1, wherein said overlap is about 10 base pairs, about 15 base pairs, about 20 base pairs, about 25 base pairs, about 30 base pairs, about 35 base pairs, about 40 base pairs, about 45 base pairs, about 50 base pairs, about 55 base pairs, about 60 base pairs, about 65 base pairs, or about 70 base pairs.
29. The method of claim 5, wherein said promoter is selected from the group consisting of CMV IE, SV40 IE, RSV, β-actin, tetracycline regulatable and ecdysone regulatable.
30. The method of claim 26, wherein said genome is a viral genome.
31. The method of claim 30, wherein said viral genome is selected from the group consisting of retrovirus, adenovirus, vaccinia virus, herpesvirus and adeno-associated virus.
32. The method of claim 1, wherein said double-stranded polynucleotide is a chromosome.
33. A method of producing a viral particle comprising the steps of: (a) providing a host cell; (b) transforming said host cell with an artificial viral genome prepared by: (i) generating a first set of oligonucleotides corresponding to the entire plus strand of said viral genome; (ii) generating a second set of oligonucleotides corresponding to the entire minus strand of said viral genome; and (iii) annealing said first and said second set of oligonucleotides; wherein each of said oligonucleotides of said second set of oligonucleotides overlaps with and hybridizes to two complementary oligonucleotides of said first set of oligonucleotides, except that two oligonucleotides at a 5′ or 3′ end of said viral genome will hybridize with only one complementary oligonucleotide; and (c) culturing said transformed host cell under conditions such that said viral particle is expressed.
34. The method of claim 33, wherein said viral genome is selected from the group consisting of retrovirus, adenovirus, vaccinia virus, herpesvirus and adeno-associated virus.
35. A method of producing an artificial genome, wherein said chromosome comprises all coding regions and regulatory elements found in a corresponding natural chromosome, comprising the steps of: (a) generating a first set of oligonucleotides corresponding to the entire plus strand of said chromosome; (b) generating a second set of oligonucleotides corresponding to the entire minus strand of said chromosome; and (c) annealing said first and said second set of oligonucleotides; wherein each of said oligonucleotides of said second set of oligonucleotides overlaps with and hybridizes to two complementary oligonucleotides of said first set of oligonucleotides, except that two oligonucleotides at a 5′ or 3′ end of said chromosome will hybridize with only one complementary oligonucleotide.
36. The method of claim 35, wherein said corresponding natural chromosome is a human mitochondrial genome.
37. The method of claim 35, wherein said corresponding natural chromosome is a chloroplast genome.
38. A method of producing an artificial genetic system, wherein said system comprises all coding regions and regulatory elements found in a corresponding natural biochemical pathway, comprising the steps of: (a) generating a first set of oligonucleotides corresponding to the entire plus strand of said chromosome; (b) generating a second set of oligonucleotides corresponding to the entire minus strand of said chromosome; and (c) annealing said first and said second set of oligonucleotides; wherein each of said oligonucleotides of said second set of oligonucleotides overlaps with and hybridizes to two complementary oligonucleotides of said first set of oligonucleotides, except that two oligonucleotides at a 5′ or 3′ end of said chromosome will hybridize with only one complementary oligonucleotide wherein expression of said biochemical pathway coding regions results in the expression of a group of enzymes that serially metabolize a compound.
39. The method of claim 38, wherein said biochemical pathway comprises the activities required for glycolysis.
40. The method of claim 38, wherein said biochemical pathway comprises the enzymes required for electron transport.
41. The method of claim 38, wherein said biochemical pathway comprises the enzyme activities required for photosynthesis.
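
The overlapping-oligonucleotide arrangement recited in steps (a) through (c) of claim 1 can be illustrated computationally. The following Python sketch is a hypothetical illustration only and is not part of the claimed method: the function names `design_oligos` and `partners`, the 50-base oligonucleotide length and the 25-base stagger are all assumptions chosen for the example. It tiles the plus strand with contiguous oligonucleotides, tiles the minus strand with boundaries shifted by the stagger, and counts how many plus-strand oligonucleotides each minus-strand oligonucleotide would hybridize to.

```python
# Hypothetical sketch of the plus/minus oligonucleotide tiling in claim 1.
# All names and default values are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligos(plus_strand, oligo_len=50, stagger=25):
    """Split a duplex into plus-strand and minus-strand oligonucleotides.

    Each oligonucleotide is returned as (start, end, sequence), where start
    and end are plus-strand coordinates and sequence is written 5' to 3'.
    Minus-strand boundaries are shifted by `stagger` bases so that each
    interior minus-strand oligonucleotide bridges one plus-strand junction.
    """
    n = len(plus_strand)
    if n == 0:
        raise ValueError("plus_strand must not be empty")
    if not 0 < stagger < oligo_len:
        raise ValueError("stagger must be between 0 and oligo_len")

    # Plus strand: contiguous, non-overlapping pieces.
    plus = [
        (s, min(s + oligo_len, n), plus_strand[s:s + oligo_len])
        for s in range(0, n, oligo_len)
    ]

    # Minus strand: same piece length, boundaries offset by `stagger`.
    bounds = [0] + list(range(stagger, n, oligo_len)) + [n]
    minus = [
        (a, b, reverse_complement(plus_strand[a:b]))
        for a, b in zip(bounds, bounds[1:])
    ]
    return plus, minus


def partners(oligo, other_set):
    """Return the oligonucleotides in other_set that overlap `oligo`."""
    start, end, _ = oligo
    return [o for o in other_set if o[0] < end and start < o[1]]


if __name__ == "__main__":
    template = "ACGT" * 50  # arbitrary 200-base plus strand for illustration
    plus, minus = design_oligos(template)
    for m in minus:
        print(m[0], m[1], len(partners(m, plus)))
```

With these assumed parameters, each interior minus-strand oligonucleotide spans the junction between two plus-strand oligonucleotides, while the two minus-strand oligonucleotides at the ends of the duplex pair with only one, which is the exception stated in claim 1. The stagger sets the overlap with one neighbor and the oligonucleotide length minus the stagger sets the overlap with the other, so both can be kept within the range of about 5 to about 75 base pairs recited in claim 27.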